Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement even in difficult contexts. To elucidate the mechanisms by which models accomplish this behavior, this study applies causal mediation analysis to pre-trained neural language models. We investigate the magnitude of models' preferences for grammatical inflections, as well as whether neurons process subject-verb agreement similarly across sentences with different syntactic structures. We uncover similarities and differences across architectures and model sizes -- notably, that larger models do not necessarily learn stronger preferences. We also observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure of the input sentence. Finally, we find that language models rely on similar sets of neurons when given sentences with similar syntactic structure.
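To make the causal-mediation quantities concrete, here is a minimal, hedged sketch of the style of measurement the abstract describes. A real study would query a pretrained language model; below, `model_prob` is a toy stand-in whose probabilities, sentence prompts, and single "mediating neuron" are illustrative assumptions, not the paper's actual model, values, or exact formulation.

```python
# Toy sketch of causal mediation analysis for subject-verb agreement.
# All probabilities and the single-neuron mediator are invented for
# illustration; a real analysis would read activations from an LM.

def model_prob(sentence, verb, neuron_override=None):
    # Hypothetical "model": one neuron encodes subject number, and the
    # next-verb distribution depends only on that neuron's value.
    plural_subject = sentence.startswith("The keys")
    neuron = (1.0 if plural_subject else 0.0) if neuron_override is None else neuron_override
    p_are = 0.1 + 0.6 * neuron            # probability of the plural verb "are"
    return p_are if verb == "are" else 1.0 - p_are

def preference(sentence, neuron_override=None):
    """Relative preference for the plural inflection over the singular one."""
    return (model_prob(sentence, "are", neuron_override)
            / model_prob(sentence, "is", neuron_override))

base = "The key to the cabinets"          # singular subject
swapped = "The keys to the cabinets"      # intervention: swap subject number

# Total effect: how much the subject-number intervention shifts the preference.
total_effect = preference(swapped) / preference(base) - 1.0

# Indirect effect of the neuron: keep the base input, but set the neuron to
# the value it would take under the swapped input.
indirect_effect = preference(base, neuron_override=1.0) / preference(base) - 1.0

print(total_effect, indirect_effect)
```

In this toy setup the neuron fully mediates the effect, so the indirect effect equals the total effect; in a real model, comparing these quantities per neuron is what localizes the agreement mechanism.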