The outstanding performance of transformer-based language models on a wide variety of NLP and NLU tasks has stimulated interest in exploring their inner workings. Recent research has focused primarily on higher-level and complex linguistic phenomena such as syntax, semantics, world knowledge, and common sense. The majority of these studies are anglocentric, and little is known about other languages, particularly their morphosyntactic properties. To this end, we present Morph Call, a suite of 46 probing tasks for four Indo-European languages with different morphology: English, French, German, and Russian. We propose a new type of probing task based on the detection of guided sentence perturbations. We use a combination of neuron-, layer-, and representation-level introspection techniques to analyze the morphosyntactic content of four multilingual transformers, including their less-explored distilled versions. We also examine how fine-tuning for POS tagging affects the models' knowledge. The results show that fine-tuning can both improve and degrade probing performance and can change how morphosyntactic knowledge is distributed across the model. The code and data are publicly available, and we hope they help fill the gaps in this less studied aspect of transformers.
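To make the probing setup concrete, the sketch below shows the standard layer-wise probing recipe the abstract alludes to: a linear classifier trained on frozen hidden states, fit separately per layer. The model name, the toy subject-number task, and the example sentences are illustrative assumptions only, not the paper's actual Morph Call tasks, data, or evaluation protocol.

```python
# Minimal layer-wise probing sketch: train a linear probe on frozen
# representations from each encoder layer. Toy task and data are
# hypothetical; real probing uses held-out test sets and control baselines.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL = "bert-base-multilingual-cased"  # an example multilingual transformer
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

# Hypothetical toy probing data: label 0 = singular subject, 1 = plural.
sentences = ["The dog runs fast.", "The dogs run fast.",
             "A bird sings loudly.", "Many birds sing loudly."]
labels = [0, 1, 0, 1]

def layer_features(texts, layer):
    """Mean-pool token embeddings from one frozen encoder layer."""
    feats = []
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt")
            hidden = model(**enc).hidden_states[layer]  # (1, seq_len, dim)
            feats.append(hidden.mean(dim=1).squeeze(0).numpy())
    return feats

# Fit one linear probe per layer (layer 0 is the embedding layer); higher
# probe accuracy suggests the layer encodes the property more linearly.
for layer in range(model.config.num_hidden_layers + 1):
    X = layer_features(sentences, layer)
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    print(f"layer {layer}: train acc = {probe.score(X, labels):.2f}")
```

Comparing probe accuracies across layers, and between pre-trained and POS-fine-tuned checkpoints of the same model, is one simple way to see where in the network a morphosyntactic property is encoded and how fine-tuning redistributes it.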