Recursive processing in sentence comprehension is considered a hallmark of human linguistic abilities. However, its underlying neural mechanisms remain largely unknown. We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing, namely the storing of grammatical number and gender information in working memory and its use in long-distance agreement (e.g., capturing the correct number agreement between subject and verb when they are separated by other phrases). Although the network, a recurrent architecture with Long Short-Term Memory units, was solely trained to predict the next word in a large corpus, analysis showed the emergence of a very sparse set of specialized units that successfully handled local and long-distance syntactic agreement for grammatical number. However, the simulations also showed that this mechanism does not support full recursion and fails with some long-range embedded dependencies. We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns, with or without embedding. Human and model error patterns were remarkably similar, showing that the model echoes various effects observed in human data. However, a key difference was that, with embedded long-range dependencies, humans remained above chance level, while the model's systematic errors brought it below chance. Overall, our study shows that exploring the ways in which modern artificial neural networks process sentences leads to precise and testable hypotheses about human linguistic performance.