In recent years, pre-trained multilingual language models such as multilingual BERT and XLM-R have shown strong performance on zero-shot cross-lingual transfer learning. However, since their contextual embedding spaces for different languages are not perfectly aligned, the differences between representations of different languages can cause zero-shot cross-lingual transfer to fail in some cases. In this work, we draw a connection between these failure cases and adversarial examples. We then propose to use robust training methods to train a model that can tolerate noise in the input embeddings. We study two widely used robust training methods: adversarial training and randomized smoothing. The experimental results demonstrate that robust training improves zero-shot cross-lingual transfer for text classification, and the performance gains become more significant as the distance between the source language and the target language increases.
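As an illustration of one of the two robust training methods mentioned above, the following is a minimal sketch of FGM-style adversarial training applied to input embeddings, written in PyTorch. The names `classifier`, `embeds`, `labels`, and `loss_fn`, as well as the value of `epsilon`, are hypothetical placeholders and not taken from the paper.

```python
import torch

def adversarial_step(classifier, embeds, labels, loss_fn, epsilon=1e-2):
    """One training step with an adversarial perturbation in embedding space.

    Assumptions (not from the paper): `classifier` maps input embeddings of
    shape (batch, seq_len, dim) to logits, `loss_fn` is e.g. cross-entropy.
    """
    embeds = embeds.detach().requires_grad_(True)

    # Loss on the clean (unperturbed) embeddings.
    clean_loss = loss_fn(classifier(embeds), labels)

    # Gradient of the loss w.r.t. the embeddings.
    grad, = torch.autograd.grad(clean_loss, embeds, create_graph=False)

    # FGM-style perturbation: move the embeddings a small step in the
    # direction that increases the loss, normalized by the gradient norm.
    delta = epsilon * grad / (grad.norm(p=2) + 1e-12)

    # Loss on the adversarially perturbed embeddings.
    adv_loss = loss_fn(classifier((embeds + delta).detach()), labels)

    # Optimize the sum of clean and adversarial losses so the model
    # learns to tolerate small perturbations of its input embeddings.
    return clean_loss + adv_loss
```

The intent of training on both the clean and perturbed embeddings is that small discrepancies between source- and target-language representations behave like the perturbation `delta`, so a model that is robust to such noise should transfer better across languages.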