Factored neural machine translation (FNMT) is founded on the idea of using the morphological and grammatical decomposition of words (factors) on the output side of the neural network. This architecture addresses two well-known problems in MT: the size of the target-language vocabulary and the number of unknown tokens produced in translation. The FNMT system is designed to handle a larger vocabulary and to reduce training time (for systems with an equivalent target-language vocabulary size). Moreover, we can produce grammatically correct words that are not part of the vocabulary. The FNMT model is evaluated on the IWSLT'15 English-to-French task and compared with baseline word-based and BPE-based NMT systems. Promising qualitative and quantitative results (in terms of BLEU and METEOR) are reported.
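As a rough illustration of the factored-output idea, the sketch below combines a predicted lemma with a predicted factor tag to obtain a surface word. It is a toy example under stated assumptions, not the system described here: the `INFLECTIONS` table, the factor tag format, and the `realize` and `realize_sentence` helpers are all hypothetical, and a real system would presumably rely on a morphological generator rather than a hand-written lookup.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup table: (lemma, factor tag) -> surface form.
# The tag format "pos|feature|..." is an assumption made for this sketch.
INFLECTIONS: Dict[Tuple[str, str], str] = {
    ("aller", "verb|present|1|plural"): "allons",
    ("aller", "verb|present|3|singular"): "va",
    ("chat", "noun|masc|singular"): "chat",
    ("chat", "noun|masc|plural"): "chats",
}


def realize(lemma: str, factors: str) -> str:
    """Combine a lemma and a factor tag into a surface word.

    Falls back to the bare lemma when the pair is not in the table.
    """
    return INFLECTIONS.get((lemma, factors), lemma)


def realize_sentence(pairs: List[Tuple[str, str]]) -> List[str]:
    """Realize a sequence of (lemma, factor tag) pairs as surface words."""
    return [realize(lemma, factors) for lemma, factors in pairs]
```

In this sketch the output side only needs to choose among lemmas and factor tags, so an inflected form such as "allons" can be produced even if it never appears as an entry in a word-level vocabulary.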