Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation

Neural machine translation (NMT) has recently emerged as a powerful alternative to conventional statistical approaches. However, its performance drops considerably on morphologically rich languages (MRLs), because neural engines usually fail to cope with the large vocabulary and high out-of-vocabulary (OOV) word rate of MRLs. Existing word-based models are therefore poorly suited to translating these languages. In this paper, we propose an extension to the state-of-the-art model of Chung et al. (2016), which works at the character level, and boost the decoder with target-side morphological information. In our architecture, an additional morphology table is plugged into the model. Each time the decoder samples from the target vocabulary, the table sends auxiliary signals from the most relevant affixes in order to enrich the decoder's current state and constrain it to provide better predictions. We evaluated our model on translation from English into German, Russian, and Turkish, three MRLs, and observed significant improvements.
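The morphology-table idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the "auxiliary signal" is a softmax-weighted sum of affix embeddings scored by a dot product with the decoder state, and that the signal is simply concatenated to the state. The names `affix_table`, `morphology_signal`, and `enrich_state`, as well as all dimensions, are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def morphology_signal(decoder_state, affix_table):
    """Score every affix against the decoder state (assumed dot-product
    scoring) and return the relevance weights and the weighted sum of
    affix embeddings, which serves as the auxiliary signal."""
    scores = affix_table @ decoder_state   # one score per affix
    weights = softmax(scores)              # relevance distribution over affixes
    aux = weights @ affix_table            # weighted sum of affix embeddings
    return weights, aux


def enrich_state(decoder_state, affix_table):
    """Concatenate the auxiliary signal to the decoder state
    (an assumed way of combining the two)."""
    _, aux = morphology_signal(decoder_state, affix_table)
    return np.concatenate([decoder_state, aux])


# Toy data: 5 hypothetical affixes with 8-dimensional embeddings.
rng = np.random.default_rng(0)
affix_table = rng.normal(size=(5, 8))
state = rng.normal(size=8)

weights, aux = morphology_signal(state, affix_table)
enriched = enrich_state(state, affix_table)
```

In a real decoder the enriched state would feed the output layer that predicts the next character; here it only shows how a table lookup can be driven by the current state.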

