Multilingual language models have been shown to support nontrivial transfer across scripts and languages. In this work, we study the structure of the internal representations that enable this transfer. We take the representation of gender distinctions as a practical case study and examine the extent to which the gender concept is encoded in subspaces shared across languages. Our analysis shows that gender representations consist of several prominent components that are shared across languages, alongside language-specific components. The coexistence of language-independent and language-specific components explains an intriguing empirical observation we make: while gender classification transfers well across languages, interventions for gender removal that are trained on a single language do not transfer easily to others.
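The asymmetry between classification and removal can be illustrated with a toy simulation. The sketch below is not the method or data of this work. It assumes synthetic 16-dimensional "representations" in which gender is encoded along one shared direction plus one direction per language, and it uses a class-mean-difference probe and a single-direction projection as simplified stand-ins for a classifier and a removal intervention. All names (`SHARED`, `LANG_A`, `LANG_B`, `STRENGTH`, `run`) and constants are hypothetical.

```python
import numpy as np

DIM = 16        # toy representation size (assumption)
STRENGTH = 3.0  # toy magnitude of the gender signal (assumption)


def unit(i):
    """Return the i-th standard basis vector in DIM dimensions."""
    v = np.zeros(DIM)
    v[i] = 1.0
    return v


SHARED = unit(0)  # gender component shared by both languages
LANG_A = unit(1)  # gender component specific to language A
LANG_B = unit(2)  # gender component specific to language B


def make_data(lang_dir, n, rng):
    """Gaussian noise plus a label-dependent shift along shared + language-specific directions."""
    y = rng.choice([-1.0, 1.0], size=n)
    X = rng.normal(size=(n, DIM)) + STRENGTH * y[:, None] * (SHARED + lang_dir)
    return X, y


def fit_direction(X, y):
    """Unit vector along the difference of class means (a simple linear probe)."""
    w = X[y > 0].mean(axis=0) - X[y < 0].mean(axis=0)
    return w / np.linalg.norm(w)


def accuracy(w, X, y):
    """Fraction of examples whose projection onto w has the same sign as the label."""
    return float(np.mean(np.sign(X @ w) == y))


def remove_direction(X, w):
    """Project every row of X onto the orthogonal complement of the unit vector w."""
    return X - np.outer(X @ w, w)


def run(seed=0, n=2000):
    rng = np.random.default_rng(seed)
    A_fit, yA_fit = make_data(LANG_A, n, rng)
    A_train, yA_train = make_data(LANG_A, n, rng)
    A_test, yA_test = make_data(LANG_A, n, rng)
    B_train, yB_train = make_data(LANG_B, n, rng)
    B_test, yB_test = make_data(LANG_B, n, rng)

    # Probe fitted on language A only, evaluated on language B.
    w_A = fit_direction(A_fit, yA_fit)
    transfer_acc = accuracy(w_A, B_test, yB_test)

    # Remove the A-fitted direction, then fit a fresh probe per language
    # on the projected data to see how much gender signal survives.
    def residual_acc(X_train, y_train, X_test, y_test):
        w_new = fit_direction(remove_direction(X_train, w_A), y_train)
        return accuracy(w_new, remove_direction(X_test, w_A), y_test)

    return {
        "transfer_acc": transfer_acc,
        "acc_A_after_removal": residual_acc(A_train, yA_train, A_test, yA_test),
        "acc_B_after_removal": residual_acc(B_train, yB_train, B_test, yB_test),
    }


if __name__ == "__main__":
    for name, value in run().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the probe fitted on language A still classifies language B well, because both languages share one component. Projecting out the A-fitted direction leaves a fresh probe near chance on A but leaves gender recoverable in B, since B's language-specific component is untouched and the shared component is only partly removed.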