We want to train a model on English and then test it directly on other languages. MultiBERT seems to align different languages in its representation space: to MultiBERT the languages look largely interchangeable, which is why it can be fine-tuned on English and evaluated on Chinese. This raises a question: can we erase the remaining language-specific information from MultiBERT, so that cross-lingual zero-shot transfer works even better?

Here is a very simple attempt. As before, we fine-tune MultiBERT on English and test it on Chinese, but this time, at the output, we also add the average language-difference vector computed earlier. The results show small but consistent improvements across a variety of languages. Of course, this way of erasing language differences is still crude; better techniques remain to be developed. (A rough code sketch of the idea follows the video link below.)

Video (VPN may be required): https://www.youtube.com/watch?v=8rDN1jUI82g&feature=youtu.be
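As a concrete illustration, here is a minimal sketch of how one might compute such an average language-difference vector. It is not the lecture's exact implementation: the choice of `bert-base-multilingual-cased` as the MultiBERT checkpoint, the Hugging Face `transformers` API, and the tiny `en_sentences`/`zh_sentences` lists standing in for real monolingual corpora are all assumptions made for this example.

```python
# A minimal sketch, not the lecture's exact implementation.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def mean_token_embedding(sentences):
    """Mean contextual token embedding over a list of sentences."""
    vecs = []
    with torch.no_grad():
        for s in sentences:
            inputs = tokenizer(s, return_tensors="pt", truncation=True)
            hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
            vecs.append(hidden.mean(dim=0))
    return torch.stack(vecs).mean(dim=0)                   # (hidden_dim,)

en_sentences = ["This is an English sentence."]   # hypothetical English corpus
zh_sentences = ["这是一句中文句子。"]               # hypothetical Chinese corpus

en_mean = mean_token_embedding(en_sentences)
zh_mean = mean_token_embedding(zh_sentences)

# Average language-difference vector between Chinese and English representations.
lang_diff = zh_mean - en_mean

# In the experiment described above, a vector like this is added to the hidden
# states at the output before the task-specific head; the exact sign, scale,
# and injection point are design choices, not fixed by this sketch.
```

How strongly to scale `lang_diff` and where to inject it are exactly the kind of crude design choices the text says could be replaced by better techniques in the future.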