Retrieval-augmented Neural Machine Translation (NMT) models have been successful in many translation scenarios. Unlike previous works that make use of mutually similar but redundant translation memories (TMs), we propose a new retrieval-augmented NMT model that exploits contrastively retrieved translation memories: TMs that are holistically similar to the source sentence while individually contrastive to each other, providing maximal information gain across three phases. First, in the TM retrieval phase, we adopt a contrastive retrieval algorithm to avoid the redundancy and uninformativeness of similar translation pieces. Second, in the memory encoding phase, given a set of TMs, we propose a novel Hierarchical Group Attention module to gather both the local context of each TM and the global context of the whole TM set. Finally, in the training phase, a multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence. Experimental results show that our framework obtains improvements over strong baselines on benchmark datasets.
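The retrieval phase described above can be illustrated with a minimal sketch. The snippet below implements a greedy, MMR-style selection criterion: each step picks the TM most similar to the source sentence, penalized by its maximum similarity to TMs already selected. This is an illustrative assumption, not the paper's exact algorithm; the function name, the `alpha` trade-off parameter, and the similarity inputs are all hypothetical.

```python
def contrastive_retrieve(source_sim, pairwise_sim, k, alpha=0.5):
    """Greedily select k TMs that are similar to the source yet
    contrastive to each other (an MMR-style sketch, not the paper's
    exact retrieval algorithm).

    source_sim   -- source_sim[i]: similarity of TM i to the source sentence
    pairwise_sim -- pairwise_sim[i][j]: similarity between TM i and TM j
    alpha        -- assumed relevance/diversity trade-off in [0, 1]
    """
    selected = []
    candidates = set(range(len(source_sim)))
    while candidates and len(selected) < k:
        # Relevance to the source minus redundancy w.r.t. TMs already chosen.
        best = max(
            candidates,
            key=lambda i: alpha * source_sim[i]
            - (1 - alpha) * max((pairwise_sim[i][j] for j in selected), default=0.0),
        )
        selected.append(best)
        candidates.remove(best)
    return selected
```

For example, with two near-duplicate TMs and one dissimilar TM, the sketch prefers the diverse pair over the redundant one, which is the behavior the contrastive retrieval phase is designed to achieve.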