We investigate the extent to which modern neural language models are susceptible to structural priming, the phenomenon whereby the structure of a sentence makes the same structure more probable in a follow-up sentence. We explore how priming can be used to study the potential of these models to learn abstract structural information, which is a prerequisite for good performance on tasks that require natural language understanding skills. We introduce a novel metric and release Prime-LM, a large corpus in which we control for various linguistic factors that interact with priming strength. We find that Transformer models indeed show evidence of structural priming, but also that the generalisations they have learned are to some extent modulated by semantic information. Our experiments also show that the representations acquired by the models may not only encode abstract sequential structure but also involve a certain level of hierarchical syntactic information. More generally, our study shows that the priming paradigm is a useful additional tool for gaining insights into the capacities of language models and opens the door to future priming-based investigations that probe the models' internal states.
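To make the priming setup concrete, the following is a minimal, illustrative sketch (not the paper's own metric or code) of how one could quantify structural priming with a causal language model: compare the log-probability the model assigns to a target sentence after a structurally congruent prime versus an incongruent one. The choice of GPT-2 via the HuggingFace `transformers` library, the helper function `target_logprob`, and the example dative sentences are all assumptions made here for illustration.

```python
# Illustrative sketch: measure a priming-like effect as the difference in the
# log-probability of a target sentence after a congruent vs. incongruent prime.
# Assumes the HuggingFace `transformers` library and a GPT-2 checkpoint.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()


def target_logprob(prime: str, target: str) -> float:
    """Sum of token log-probabilities of `target`, conditioned on `prime`."""
    prime_ids = tokenizer(prime, return_tensors="pt").input_ids
    target_ids = tokenizer(" " + target, return_tensors="pt").input_ids
    input_ids = torch.cat([prime_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Positions 0..n-2 of the logits predict tokens 1..n-1 of the input,
    # so the target tokens are predicted starting at index len(prime) - 1.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    start = prime_ids.shape[1] - 1
    token_lps = log_probs[start:, :].gather(1, target_ids[0].unsqueeze(1))
    return token_lps.sum().item()


# Invented example: prepositional-object vs. double-object dative primes.
target = "The boy gave a book to the teacher."
congruent = "The nurse sent a letter to the doctor."      # same structure as target
incongruent = "The nurse sent the doctor a letter."       # alternative structure

priming_effect = target_logprob(congruent, target) - target_logprob(incongruent, target)
print(f"Priming effect (log-probability difference): {priming_effect:.3f}")
```

Under this sketch, a positive difference indicates that the congruent prime made the target structure more probable, i.e. a priming-like effect.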