Google is giving developers access to its artificial intelligence (AI) programming toolkit SyntaxNet – including the English language plug-in ‘Parsey McParseface’.
SyntaxNet is an open-source neural network framework which provides a foundation for Natural Language Understanding (NLU) systems.
Parsey McParseface is a parser that Google has trained to analyze English text. Google describes it as the most accurate model of its kind in the world.
Parsey can explain the functional role of each word in a sentence.
It is difficult for computers to get parsing right because of the amount of ambiguity found in human languages. According to Google, it is common for sentences of around 20 words to have thousands of different possible syntactic structures.
Google senior staff research scientist Slav Petrov said in a blog post: “Parsey McParseface is built on powerful machine learning algorithms that learn to analyze the linguistic structure of language, and that can explain the functional role of each word in a given sentence.”
He added: “SyntaxNet applies neural networks to the ambiguity problem. An input sentence is processed from left to right, with dependencies between words being incrementally added as each word in the sentence is considered.
“At each point in processing many decisions may be possible — due to ambiguity — and a neural network gives scores for competing decisions based on their plausibility.”
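To make the left-to-right process Petrov describes more concrete, here is a minimal Python sketch of a greedy transition-based dependency parser. It is not SyntaxNet's code or API: the action names, the `PLAUSIBLE` table and the `toy_scores` function are hypothetical, and a small hand-written lookup stands in for the neural network that would score competing decisions in a real system.

```python
from typing import Dict, List, Set, Tuple

SHIFT, LEFT_ARC, RIGHT_ARC = "SHIFT", "LEFT_ARC", "RIGHT_ARC"

# Hypothetical (head word, dependent word) pairs treated as plausible.
# A real parser would get these preferences from a trained network.
PLAUSIBLE: Set[Tuple[str, str]] = {
    ("ROOT", "saw"),
    ("saw", "Alice"),
    ("saw", "Bob"),
}


def legal_actions(stack: List[int], buffer: List[int]) -> List[str]:
    """Return the actions that are allowed in the current parser state."""
    actions = []
    if buffer:
        actions.append(SHIFT)
    if len(stack) >= 2:
        actions.append(RIGHT_ARC)
        if stack[-2] != 0:  # ROOT (index 0) may never become a dependent
            actions.append(LEFT_ARC)
    return actions


def toy_scores(words: List[str], stack: List[int], buffer: List[int]) -> Dict[str, float]:
    """Stand-in for the neural network: score each competing decision."""
    scores = {SHIFT: 0.5, LEFT_ARC: 0.0, RIGHT_ARC: 0.0}
    if len(stack) >= 2:
        below, top = words[stack[-2]], words[stack[-1]]
        if (top, below) in PLAUSIBLE:
            scores[LEFT_ARC] = 1.0
        # Only attach the top word once no unread word still needs it as head.
        pending = any((top, words[i]) in PLAUSIBLE for i in buffer)
        if (below, top) in PLAUSIBLE and not pending:
            scores[RIGHT_ARC] = 1.0
    return scores


def parse(tokens: List[str]) -> List[Tuple[int, int]]:
    """Greedily parse tokens and return (head index, dependent index) arcs."""
    words = ["ROOT"] + tokens
    stack = [0]                           # partially processed words
    buffer = list(range(1, len(words)))   # words not yet read
    arcs: List[Tuple[int, int]] = []
    while buffer or len(stack) > 1:
        scores = toy_scores(words, stack, buffer)
        # Pick the highest-scoring decision among the legal ones.
        action = max(legal_actions(stack, buffer), key=lambda a: scores[a])
        if action == SHIFT:
            stack.append(buffer.pop(0))
        elif action == LEFT_ARC:
            dependent = stack.pop(-2)     # second-from-top depends on top
            arcs.append((stack[-1], dependent))
        else:  # RIGHT_ARC
            dependent = stack.pop()       # top depends on the word below it
            arcs.append((stack[-1], dependent))
    return arcs


if __name__ == "__main__":
    sentence = "Alice saw Bob".split()
    for head, dependent in parse(sentence):
        names = ["ROOT"] + sentence
        print(f"{names[dependent]} <- {names[head]}")
```

The sketch commits to a single decision at each step. When several decisions score closely, that is exactly where ambiguity bites, which is why the quality of the scoring model matters so much.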
“While there are no explicit studies in the literature about human performance, we know from our in-house annotation projects that linguists trained for this task agree in 96-97% of the cases. This suggests that we are approaching human performance — but only on well-formed text,” he added.