In this paper, we introduce ELECTRA-style tasks into cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection and translation replaced token detection. In addition, we pre-train the model, named XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks with much lower computation cost. Moreover, analysis shows that XLM-E tends to obtain better cross-lingual transferability.
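To make the shared objective behind both tasks concrete, the sketch below illustrates an ELECTRA-style replaced token detection loss: a small generator proposes replacements for masked positions, and the discriminator is trained to tell original tokens from replaced ones. The function signature, the masking ratio, and the assumption that the generator returns per-position vocabulary logits while the discriminator returns per-position binary logits are illustrative assumptions, not the paper's exact configuration; for translation replaced token detection, the same objective would be applied to the concatenation of a translation pair.

```python
# Minimal sketch of an ELECTRA-style replaced token detection (RTD) loss.
# Assumptions (not from the paper): `generator(ids, mask)` returns vocabulary
# logits of shape [B, T, V]; `discriminator(ids, mask)` returns per-token
# binary logits of shape [B, T]; mask_prob=0.15 is illustrative.
import torch
import torch.nn.functional as F


def rtd_loss(generator, discriminator, input_ids, attention_mask,
             mask_token_id, mask_prob=0.15):
    # 1) Randomly mask a subset of the (non-padding) input tokens.
    mask = (torch.rand_like(input_ids, dtype=torch.float) < mask_prob)
    mask = mask & attention_mask.bool()
    masked_ids = input_ids.masked_fill(mask, mask_token_id)

    # 2) The generator samples replacement tokens at the masked positions.
    with torch.no_grad():
        gen_logits = generator(masked_ids, attention_mask)          # [B, T, V]
        sampled = torch.distributions.Categorical(logits=gen_logits).sample()
    corrupted_ids = torch.where(mask, sampled, input_ids)

    # 3) The discriminator predicts, for each position, whether the token
    #    was replaced (label 1) or kept original (label 0).
    is_replaced = (corrupted_ids != input_ids).float()
    disc_logits = discriminator(corrupted_ids, attention_mask)      # [B, T]
    return F.binary_cross_entropy_with_logits(
        disc_logits, is_replaced, weight=attention_mask.float())
```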