Context-aware neural machine translation aims to use document-level context to improve translation quality. However, not all words in the context are helpful. Irrelevant or trivial words can introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context. To mitigate this problem, we propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism that sifts and refines the long document context. To verify the effectiveness of our method, we conduct extensive experiments and additional quantitative analysis on four document-level machine translation benchmarks. The experimental results demonstrate that, with its soft selection mechanism, our model significantly outperforms previous models on all datasets.