Microsoft's AI System Can Summarize Lengthy News Articles

Condensing paragraphs into sentences has long been a difficult task for AI, but researchers at Microsoft have recently demonstrated that it can be done.

Summarizing paragraphs into shorter sentences is not an easy task for Artificial Intelligence (AI), because it requires a deep understanding of the text that most current natural language processing systems lack. The task was widely considered out of reach until recently, when researchers at Microsoft demonstrated otherwise.

In a paper recently published on Arxiv.org, titled “Structured Neural Summarization”, researchers at Microsoft Research's Cambridge lab illustrate how their AI framework can capture relationships in "weakly structured" text, which is what lets it outperform current natural language processing (NLP) systems at summarizing texts.

This is reminiscent of Primer, a system that uses AI to analyze and collate a variety of documents. Microsoft’s AI, however, promises much broader applicability.

As the researchers describe it, summarization, which involves compressing a long, complex input into a shorter form that still conveys its central ideas, is a classic natural language processing task. Automatic summarization demands that a learning system pinpoint the important parts of a text and the connections between them while omitting the unimportant ones. Standard sequence-based methods can in principle capture arbitrary relationships between words, but in practice they tend to fail on lengthy texts and are easily thrown off by simple distractors.

Their solution combines two components: a sequence encoder-decoder, a model that processes a string of tokens and predicts each token of the output conditioned on the tokens generated before it, and a neural network that can learn directly from graph representations of annotated natural language.
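To make that sequence-modeling idea concrete, here is a minimal sketch of an autoregressive LSTM decoder in PyTorch that predicts each output token from the ones before it. The class, names, and dimensions are illustrative assumptions, not the paper's actual code.

```python
import torch
import torch.nn as nn

class AutoregressiveDecoder(nn.Module):
    """Sketch: predicts each output token conditioned on the tokens so far."""
    def __init__(self, vocab_size: int, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, prev_tokens, state=None):
        # prev_tokens: (batch, t), the output tokens generated so far.
        x = self.embed(prev_tokens)
        h, state = self.lstm(x, state)
        # Logits over the vocabulary for the next token at every step.
        return self.out(h), state

# Usage: feed the tokens produced so far, read off next-token logits.
logits, state = AutoregressiveDecoder(vocab_size=1000)(torch.tensor([[1, 5, 7]]))
```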

The composite system uses the sequence encoder to provide “rich input” to the graph network. Concretely, it combines a bidirectional long short-term memory network (LSTM) with a graph neural network (GNN) extension on the encoder side, and an LSTM decoder with a pointer network extension on the output side. Bidirectional LSTMs are a kind of recurrent neural network well suited to learning from lengthy texts.
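Below is a rough sketch of how a bidirectional LSTM encoder could be extended with one round of GNN-style message passing, in the spirit of the paper's Sequence GNN. The edge handling, update rule, and dimensions here are simplified assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class SequenceGNNEncoder(nn.Module):
    """Sketch: a BiLSTM token encoder followed by one round of message
    passing along graph edges (e.g. sequence-order or coreference links)."""
    def __init__(self, vocab_size: int, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.bilstm = nn.LSTM(hidden, hidden // 2, batch_first=True,
                              bidirectional=True)
        self.msg = nn.Linear(hidden, hidden)      # transform outgoing messages
        self.update = nn.GRUCell(hidden, hidden)  # fold messages into node states

    def forward(self, tokens, edges):
        # tokens: (seq_len,) token ids; edges: list of (src, dst) index pairs.
        h, _ = self.bilstm(self.embed(tokens).unsqueeze(0))
        h = h.squeeze(0)                          # (seq_len, hidden) node states
        agg = torch.zeros_like(h)
        for src, dst in edges:                    # sum incoming messages per node
            agg[dst] = agg[dst] + self.msg(h[src])
        return self.update(agg, h)                # updated node representations

# Usage: encode four tokens linked in sequence order.
enc = SequenceGNNEncoder(vocab_size=1000)
states = enc(torch.tensor([3, 14, 15, 9]), edges=[(0, 1), (1, 2), (2, 3)])
```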

They had the Sequence GNNs perform three tasks: predicting a function's name from its source code (Method naming); generating a description of a method's functionality (Method doc); and producing a concise summary of a given text input (NL summarization).
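To illustrate what these tasks look like, here are invented input/output pairs; they are not drawn from the paper's datasets.

```python
# Hypothetical examples of the three tasks (illustrative, not from the paper).
tasks = {
    "Method naming": {
        "input":  "int f(int[] xs) { int s = 0; for (int x : xs) s += x; return s; }",
        "output": ["sum", "array"],   # predicted name subtokens
    },
    "Method doc": {
        "input":  "the same method body as above",
        "output": "Returns the sum of the elements in the given array.",
    },
    "NL summarization": {
        "input":  "the full text of a CNN or Daily Mail news article",
        "output": "a one- or two-sentence highlight of the article",
    },
}
```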

The team picked two datasets for the first task. The first is a small Java dataset, split into training, validation, and test sets. The second was built from 23 open-source C# projects on GitHub. For the second task, the same 23 C# projects were reused. For the third, the researchers drew on news articles from CNN and the Daily Mail.

How was the graph that the AI model extracts information from created? First, the researchers split the input into individual identifier tokens (and subtokens), then connected those tokens to construct the graph. In the code tasks, tokens could be classes, methods, variables, and so on, while natural language text was tokenized with Stanford's CoreNLP toolkit.
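A simplified sketch of what this graph construction could look like for a token sequence follows; the edge types, regex, and helper function are hypothetical rather than the paper's actual pipeline.

```python
import re

def build_token_graph(tokens):
    """Sketch: link consecutive tokens with NEXT edges and connect
    compound identifiers to their subtokens with SUBTOKEN edges."""
    nodes, edges = [], []
    for tok in tokens:
        tid = len(nodes)
        nodes.append(tok)
        if tid > 0:
            edges.append((tid - 1, tid, "NEXT"))      # sequence order
        # Split camelCase identifiers into subtoken nodes.
        subtoks = re.findall(r"[A-Za-z][a-z]*", tok)
        if len(subtoks) > 1:
            for sub in subtoks:
                sid = len(nodes)
                nodes.append(sub)
                edges.append((tid, sid, "SUBTOKEN"))
    return nodes, edges

# Usage: "getUserName" is split into "get", "User", "Name" subtoken nodes.
nodes, edges = build_token_graph(["getUserName", "returns", "the", "name"])
```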

So how did the AI system perform?

The Sequence GNNs achieved the best Method naming performance on both the C# and Java datasets as measured by F-score (a metric that balances precision and recall, here reported on a 0-100 scale), with scores of 63.4 and 51.4 respectively. Performance was slightly worse on Method doc, which the researchers attribute to the length of the predictions. The last test, NL summarization, fell short of the state of the art; the researchers point to the training objective and the simplistic decoder as likely causes, and say both can be improved in future experiments.
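For reference, an F-score is the harmonic mean of precision and recall; a minimal computation over predicted versus reference subtokens might look like this generic sketch (not the paper's evaluation code):

```python
def f1_score(predicted, reference):
    """F1 over two sets of subtokens: harmonic mean of precision and recall."""
    pred, ref = set(predicted), set(reference)
    if not pred or not ref:
        return 0.0
    overlap = len(pred & ref)
    precision = overlap / len(pred)   # fraction of predictions that are correct
    recall = overlap / len(ref)       # fraction of the reference that was found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g. f1_score(["get", "name"], ["get", "user", "name"]) == 0.8
```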

The researchers are excited about this foundational improvement and eager to push their hybrid graph modeling further, into a variety of tasks beyond natural language. They believe that using explicit relationship modeling to induce learning biases is the key to further boosting the performance of deep learning systems.