We present a survey covering the state of the art in low-resource machine translation (MT) research. There are currently around 7000 languages spoken in the world, and almost all language pairs lack significant resources for training machine translation models. Interest has been growing in research that addresses the challenge of producing useful translation models when very little translated training data is available. We summarize this topical research field and describe the techniques evaluated by researchers in several recent shared tasks in low-resource MT.