Prior work has shown that structural supervision helps English language models learn generalizations about syntactic phenomena such as subject-verb agreement. However, it remains unclear whether such an inductive bias would also improve language models' ability to learn grammatical dependencies in typologically different languages. Here we investigate this question in Mandarin Chinese, which has a logographic, largely syllable-based writing system; different word order; and sparser morphology than English. We train LSTMs, Recurrent Neural Network Grammars, Transformer language models, and Transformer-parameterized generative parsing models on two Mandarin Chinese datasets of different sizes. We evaluate the models on tests targeting different aspects of Mandarin grammar, covering both syntactic and semantic relationships. We find suggestive evidence that structural supervision helps models represent syntactic state across intervening content and improves performance in low-data settings, suggesting that the benefits of hierarchical inductive biases in acquiring dependency relationships may extend beyond English.