Since language models are used to model a wide variety of languages, it is natural to ask whether the neural architectures used for the task have inductive biases towards modeling particular types of languages. Investigation of these biases has proved complicated due to the many variables that appear in the experimental setup. Languages vary in many typological dimensions, and it is difficult to single out one or two to investigate without the others acting as confounders. We propose a novel method for investigating the inductive biases of language models using artificial languages. These languages are constructed to allow us to create parallel corpora across languages that differ only in the typological feature being investigated, such as word order. We then use them to train and test language models. This constitutes a fully controlled causal framework, and demonstrates how grammar engineering can serve as a useful tool for analyzing neural models. Using this method, we find that commonly used neural architectures exhibit different inductive biases: LSTMs display little preference with respect to word ordering, while transformers display a clear preference for some orderings over others. Further, we find that neither the inductive bias of the LSTM nor that of the transformer appears to reflect any tendencies that we see in attested natural languages.
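To make the parallel-corpus idea concrete, below is a minimal sketch of how corpora differing only in word order might be generated from a shared set of underlying sentences. The toy vocabulary, the three orderings, and the `parallel_corpora` helper are illustrative assumptions for this sketch, not the grammars used in the paper; in practice one would train identical language models on each corpus and compare held-out perplexities.

```python
import random

# Hypothetical toy lexicon (invented for this sketch): each underlying
# sentence is a (subject, verb, object) triple drawn from these sets.
SUBJECTS = ["kiki", "bubo", "lalu"]
VERBS = ["zapa", "miro", "teku"]
OBJECTS = ["nofi", "gara", "pilu"]

# Linearization rules: the only thing that differs across corpora.
ORDERS = {
    "SVO": lambda s, v, o: [s, v, o],
    "SOV": lambda s, v, o: [s, o, v],
    "VSO": lambda s, v, o: [v, s, o],
}


def parallel_corpora(n_sentences, seed=0):
    """Sample the underlying (S, V, O) triples once, then linearize the
    same triples under every word order, so the resulting corpora differ
    only in the typological feature under investigation."""
    rng = random.Random(seed)
    triples = [
        (rng.choice(SUBJECTS), rng.choice(VERBS), rng.choice(OBJECTS))
        for _ in range(n_sentences)
    ]
    return {
        name: [" ".join(order(*t)) for t in triples]
        for name, order in ORDERS.items()
    }


if __name__ == "__main__":
    for name, sents in parallel_corpora(3).items():
        print(name, sents)
    # Next step (not shown): train the same LSTM / transformer language
    # model on each corpus and compare held-out perplexities to probe
    # the architecture's inductive bias toward particular orderings.
```

Because every corpus is a deterministic relinearization of the same sampled triples, any difference in language-model performance across the corpora can be attributed to word order alone rather than to lexical or distributional confounds.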