The unreasonable effectiveness of few-shot learning for machine translation

We demonstrate the potential of few-shot translation systems, trained with unpaired language data, for both high- and low-resource language pairs. We show that with only five examples of high-quality translation data shown at inference, a decoder-only transformer model trained solely with self-supervised learning is able to match specialized supervised state-of-the-art models as well as more general commercial translation systems. In particular, we outperform the best-performing system on the WMT'21 English-Chinese news translation task using only five examples of English-Chinese parallel data at inference. Moreover, our approach to building these models does not require joint multilingual training or back-translation, is conceptually simple, and shows the potential to extend to the multilingual setting. Furthermore, the resulting models are two orders of magnitude smaller than state-of-the-art language models. We then analyze the factors that affect the performance of few-shot translation systems, and highlight that the quality of the few-shot demonstrations heavily determines the quality of the translations generated by our models. Finally, we show that the few-shot paradigm also provides a way to control certain attributes of the translation: we are able to control for regional varieties and formality using only five examples at inference, paving the way towards controllable machine translation systems.
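The abstract does not specify how the five demonstrations are presented to the model. As a minimal sketch only, assuming a plain "language: sentence" prompt layout, the few-shot setup could look like the following. The function name `build_few_shot_prompt`, the prompt template, and the demonstration sentences are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    demos: List[Tuple[str, str]],
    source: str,
    src_lang: str = "English",
    tgt_lang: str = "Chinese",
) -> str:
    """Format (source, target) demonstration pairs plus a new source sentence.

    The template is an assumption for illustration: each demonstration is
    rendered as two labelled lines, and the final block leaves the target
    side empty so a decoder-only model can continue from it.
    """
    blocks = [f"{src_lang}: {s}\n{tgt_lang}: {t}" for s, t in demos]
    # The sentence to translate goes last, with an open target slot.
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(blocks)


# Five illustrative demonstration pairs (placeholders, not data from the paper).
DEMOS = [
    ("Good morning.", "早上好。"),
    ("Thank you very much.", "非常感谢。"),
    ("Where is the train station?", "火车站在哪里？"),
    ("The meeting starts at nine.", "会议九点开始。"),
    ("I like reading books.", "我喜欢读书。"),
]

prompt = build_few_shot_prompt(DEMOS, "The weather is nice today.")
print(prompt)
```

Because the abstract reports that demonstration quality heavily determines translation quality, the choice of pairs in `DEMOS` is the part of such a setup that matters most. Swapping in demonstrations written in a particular regional variety or level of formality is, under the same assumed template, how the attribute control described above would be exercised.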

Related content

Few-shot learning

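Few-shot learning (FSL) addresses the problem of improving neural network performance when only a small amount of data is available. One class of methods commonly used in FSL is known as meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, but these are called meta-training and meta-testing, respectively.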
