Written by Jennifer Manfrin, CNN
Many scientists rely on algorithms to recover pieces of information lost to messy text errors, dramatic presentation slides and revisions.
But this “effortless power of text to pencil,” as John Lieder of the University of California, Los Angeles (UCLA) calls it, seems to be waning. New research shows that the Information Generating Process (IGP) algorithm, which was unveiled in 2013, isn’t very effective at selecting and representing accurate, concise summaries of data.
Making sense of data: Linguists look for better language analysis tools
Several researchers, including Saul Kaplan, a professor at UCLA; Andrew Packer and Xiao-Deng Ba at the University of Cambridge; and Peter Woroniecki, author of a German study at Munich’s Technical University, have put these findings to the test. When building sentences from the data, the researchers tested whether IGP could generate a sentence from disparate data points using a more reliable method than subtracting and multiplying. Once reworked in light of the results, IGP seemed to work better.
The algorithm, also called the MetaSimulator, was developed at UCLA by Fabien Leontieux and Mateo Bianchi in 2013. The idea for IGP evolved from their previously developed meta-citation system. The more reliable meta-citation system processes three-dimensional and other multidimensional data. It scans the Internet, grades publicly available data, finds references to documents, compiles pre-existing citations and creates a form of “background knowledge.”
Conventional text retrieval reduces a text by subtracting factors that are important to an article’s meaning, whereas IGP uses the decline in the same data. The system checks how much a user’s first two grammar hints change the whole story, which is what leads to the “inceptive selection algorithm” in the first place.
The authors of the latest research, published in Harvard Business Review, reasoned that perhaps people aren’t applying IGP the way the researchers did. Instead of using it to create a more concise paragraph, they may be using it as a “faster, more efficient way to load a disorganized page with decimals and other information it was meant to find in a copybook,” wrote the authors.
The IGP algorithm works by generating declarative sentences and then asking what the meaning of the words is. Common information points include names, dates, keywords and sites. The system also contains “background knowledge” that re-evaluates the sources and the new data. If the words tend to match their own content, the sentence will be composed well. As a result, it may not use all data points.
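To make that selection step concrete, here is a toy sketch in Python. It is not the IGP code, which the study does not publish in this article; the function names, the word-overlap score and the 0.8 threshold are all assumptions chosen for illustration. It shows only the general idea of keeping candidate sentences whose words match the source material and dropping the rest.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_score(sentence, source_text):
    """Fraction of the sentence's distinct words that also occur in the source."""
    words = tokenize(sentence)
    if not words:
        return 0.0
    return len(words & tokenize(source_text)) / len(words)


def select_sentences(candidates, source_text, threshold=0.8):
    """Keep candidates whose words mostly match the source text.

    The threshold is an arbitrary illustrative value, not taken from the study.
    """
    return [s for s in candidates if match_score(s, source_text) >= threshold]


# Hypothetical source text and candidate sentences.
source = "The algorithm was unveiled in 2013 at UCLA."
candidates = [
    "The algorithm was unveiled in 2013.",
    "The speaker was sold in Munich.",
]
selected = select_sentences(candidates, source)
```

In this sketch the second candidate is dropped because only half of its words appear in the source, which mirrors the point that not every data point ends up in the output.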
“Our measurements indicate that IGP is relatively poor at outputting written summaries of information on a broad scale, unless either the data used is cut significantly or there is a foreword in the language selected,” said Penny Demmick, a UCLA doctoral student and the study’s co-lead author.
The researchers say that although their findings show IGP is not as accurate as it could be, it has some other advantages.
“A leading critic of IGP claimed that IGP would show that Google search results are better than standard responses,” said Demmick. “We found no difference. People rely on search results more in situations where they can’t see what the original document is, like looking at handwritten lab reports. Similarly, the combinatorial editing methods may have to be used to recover the appropriate points in a paper, such as breaking sentences and feeding one into another.”
Still, there is hope that the researchers’ findings will lead to improved algorithms.
“These findings suggest that more lay-critics should be trained to use IGP, especially for writing more effective online articles,” said Dr. Branden Hollander, a visiting researcher in computer science and engineering from the University of Freiburg. “Whether the artificial intelligence system is improved in the future is hard to say, but giving the IGP systems the word recognition algorithms of humans could lead to much improved uses of Google search results.”