Linguistic analysis of language models is one way to explain and describe their reasoning, weaknesses, and limitations. Within the probing strand of model interpretability research, studies examine individual languages as well as individual linguistic structures. The question arises: are the detected regularities linguistically coherent, or do they, on the contrary, clash at the typological scale? Moreover, the majority of studies address a fixed, inherited set of languages and linguistic structures, leaving actual typological diversity out of scope. In this paper, we present and apply a GUI-assisted framework that allows us to easily probe a massive number of languages for all the morphosyntactic features present in the Universal Dependencies data. We show that, reflecting the Anglocentric trend in NLP over the past years, most of the regularities revealed in the mBERT model are typical of Western European languages. Our framework can be integrated with existing probing toolboxes, model cards, and leaderboards, allowing practitioners to use and share their standard probing methods to interpret multilingual models. We thus propose a toolkit to systematize the multilingual flaws of multilingual models, providing a reproducible experimental setup for 104 languages and 80 morphosyntactic features. The framework is available at https://github.com/AIRI-Institute/Probing_framework.
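To make the probing setup described above concrete, the following is a minimal sketch of one probing step under standard assumptions (a linear probe trained on frozen mBERT sentence representations for a single morphosyntactic feature). It is not the framework's actual API; the feature, sentences, and labels are illustrative placeholders, and the real setup covers 104 languages and 80 UD features.

```python
# Minimal probing sketch (assumed setup, not the framework's API):
# mean-pooled mBERT representations + a logistic-regression probe
# for one binary morphosyntactic feature.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def sentence_embedding(sentence: str) -> torch.Tensor:
    """Mean-pool the last hidden layer of mBERT into a fixed-size sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

# Hypothetical data: sentences labelled with one UD feature value
# (e.g. presence of Mood=Imp), split into train and test sets.
train_sents, train_labels = ["Close the door.", "The door is closed."], [1, 0]
test_sents, test_labels = ["Open the window.", "The window is open."], [1, 0]

X_train = torch.stack([sentence_embedding(s) for s in train_sents]).numpy()
X_test = torch.stack([sentence_embedding(s) for s in test_sents]).numpy()

# Train the probe on frozen representations and report F1 on held-out data;
# the probe's score is read as how well the feature is encoded by the model.
probe = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
print("probe F1:", f1_score(test_labels, probe.predict(X_test)))
```

Repeating this loop per language and per morphosyntactic feature yields the language-by-feature grid of probing scores on which typological comparisons are made.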