Modelling Latent Translations for Cross-Lingual Transfer

While achieving state-of-the-art results in multiple tasks and languages, translation-based cross-lingual transfer is often overlooked in favour of massively multilingual pre-trained encoders. Arguably, this is due to its main limitations: 1) translation errors percolating to the classification phase and 2) the insufficient expressiveness of the maximum-likelihood translation. To remedy this, we propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model, by treating the intermediate translations as a latent random variable. As a result, 1) the neural machine translation system can be fine-tuned with a variant of Minimum Risk Training where the reward is the accuracy of the downstream task classifier. Moreover, 2) multiple samples can be drawn to approximate the expected loss across all possible translations during inference. We evaluate our novel latent translation-based model on a series of multilingual NLU tasks, including commonsense reasoning, paraphrase identification, and natural language inference. We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average, which are even more prominent for low-resource languages (e.g., Haitian Creole). Finally, we carry out in-depth analyses comparing different underlying NMT models and assessing the impact of alternative translations on the downstream performance.
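The two ingredients named in the abstract, averaging over sampled translations at inference time and a Minimum Risk Training objective driven by the classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `expected_class_probs` and `mrt_risk`, the uniform average over samples, the sharpness parameter `alpha`, and the use of `1 - reward` as the cost are all assumptions made for illustration. The abstract only states that the reward is the accuracy of the downstream classifier and that multiple samples approximate the expected loss.

```python
import math
from typing import Callable, List, Sequence


def expected_class_probs(
    samples: Sequence[str],
    classify: Callable[[str], Sequence[float]],
) -> List[float]:
    """Monte Carlo estimate of the class distribution.

    Assumption: each sampled translation contributes equally, so the
    estimate is a plain average of the classifier's probabilities.
    """
    if not samples:
        raise ValueError("at least one sampled translation is required")
    probs = [classify(text) for text in samples]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def mrt_risk(
    log_probs: Sequence[float],
    rewards: Sequence[float],
    alpha: float = 1.0,
) -> float:
    """MRT-style expected risk over a finite set of sampled translations.

    log_probs: translation-model log-probability of each sample.
    rewards:   downstream reward per sample in [0, 1] (assumed here to be
               1.0 when the classifier is correct on that sample, else 0.0).
    alpha:     hypothetical sharpness of the renormalised distribution.
    """
    if len(log_probs) != len(rewards) or not log_probs:
        raise ValueError("log_probs and rewards must be non-empty and aligned")
    scaled = [alpha * lp for lp in log_probs]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    # Expected cost under the distribution renormalised over the samples.
    return sum((w / total) * (1.0 - r) for w, r in zip(weights, rewards))


def toy_classifier(text: str) -> List[float]:
    """Stand-in classifier used only to exercise the functions above."""
    return [0.9, 0.1] if "good" in text else [0.2, 0.8]
```

In a real system the samples would come from an NMT model and the risk would be minimised by gradient descent on the translation model's parameters; here both functions operate on plain numbers so the arithmetic can be checked by hand. For example, `expected_class_probs(["a good film", "a bad film"], toy_classifier)` averages the two toy distributions, so a translation that flips the label pulls the ensemble prediction rather than deciding it alone.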


