Nearest Neighbor Machine Translation (kNN-MT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest neighbor retrieval mechanism. The effectiveness of kNN-MT directly depends on the quality of the retrieved neighbors. However, the original kNN-MT builds its datastore from the representations of the NMT model, so retrieval accuracy degrades when the NMT model is not good enough, leading to sub-optimal translation performance. In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT. The better representations from pre-trained models allow us to build datastores of higher quality. We also design a novel contrastive alignment objective to mitigate the representation gap between the NMT model and the pre-trained models, enabling the NMT model to retrieve from these better datastores. We conduct extensive experiments on both bilingual and multilingual translation benchmarks, including WMT17 English $\leftrightarrow$ Chinese, WMT14 English $\leftrightarrow$ German, IWSLT14 German $\leftrightarrow$ English, and the IWSLT14 multilingual datasets. Empirical results demonstrate the effectiveness of PRED.
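For context, a minimal sketch of the standard kNN-MT decoding rule from prior work; the notation here ($q_t$, $\lambda$, $T$, $\mathcal{N}_t$) is illustrative and may differ from the formulation used in this paper. At each decoding step $t$, the query $q_t$ (a decoder representation) retrieves $k$ key-value pairs $(k_i, v_i)$ from the datastore, and the next-token probability interpolates the NMT distribution with a retrieval distribution:

$$
p(y_t \mid x, y_{<t}) = \lambda\, p_{\mathrm{kNN}}(y_t \mid x, y_{<t}) + (1-\lambda)\, p_{\mathrm{NMT}}(y_t \mid x, y_{<t}),
\qquad
p_{\mathrm{kNN}}(y_t \mid x, y_{<t}) \propto \sum_{(k_i, v_i) \in \mathcal{N}_t} \mathbb{1}[y_t = v_i]\, \exp\!\left(-\frac{d(k_i, q_t)}{T}\right),
$$

where $d(\cdot,\cdot)$ is a distance between the query and a datastore key, $T$ is a temperature, and $\lambda$ is the interpolation weight. PRED changes which representations are used to build the datastore keys (pre-trained models instead of the NMT model), aiming to improve the quality of this retrieval step.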