To model behavioral and neural correlates of language comprehension in naturalistic environments, researchers have turned to broad-coverage tools from natural-language processing and machine learning. Where syntactic structure is explicitly modeled, prior work has relied predominantly on context-free grammars (CFGs), yet such formalisms are not sufficiently expressive for human languages. Combinatory Categorial Grammars (CCGs) are sufficiently expressive, directly compositional models of grammar whose flexible constituency affords incremental interpretation. In this work we evaluate whether a more expressive CCG provides a better model than a CFG for human neural signals collected with fMRI while participants listened to an audiobook story. We further test between variants of CCG that differ in how they handle optional adjuncts. These evaluations are carried out against a baseline that includes estimates of next-word predictability from a Transformer neural network language model. This comparison reveals unique contributions of CCG structure-building predominantly in the left posterior temporal lobe: CCG-derived measures offer a superior fit to neural signals compared to those derived from a CFG. These effects are spatially distinct from bilateral superior temporal effects that are unique to predictability. Neural effects for structure-building are thus separable from predictability during naturalistic listening, and they are best characterized by a grammar whose expressive power is motivated on independent linguistic grounds.