Neural Machine Translation models are sensitive to noise in the input texts, such as misspelled words and ungrammatical constructions. Existing robustness techniques generally fail when faced with unseen types of noise, and their performance degrades on clean texts. In this paper, we focus on three types of realistic noise that are commonly generated by humans and introduce the idea of visual context to improve translation robustness for noisy texts. In addition, we describe a novel error correction training regime that can be used as an auxiliary task to further improve translation robustness. Experiments on English-French and English-German translation show that both the multimodal and the error correction components improve model robustness to noisy texts while retaining translation quality on clean texts.