Though nearest neighbor Machine Translation ($k$NN-MT) \citep{khandelwal2020nearest} has been shown to yield significant performance gains over standard neural MT systems, it is prohibitively slow since it uses the entire reference corpus as the datastore for the nearest neighbor search. This means that each decoding step for each beam in the beam search has to search over the entire reference corpus. $k$NN-MT is thus two orders of magnitude slower than vanilla MT models, making it hard to apply in real-world settings, especially online services. In this work, we propose Fast $k$NN-MT to address this issue. Fast $k$NN-MT constructs a significantly smaller datastore for the nearest neighbor search: for each word in a source sentence, Fast $k$NN-MT first selects its nearest token-level neighbors, restricted to reference tokens identical to the query token. Then, at each decoding step, instead of using the entire corpus as the datastore, the search space is limited to target tokens corresponding to the previously selected reference source tokens. This strategy avoids searching the whole datastore for nearest neighbors and drastically improves decoding efficiency. Without loss of performance, Fast $k$NN-MT is two orders of magnitude faster than $k$NN-MT, and only two times slower than the standard NMT model. Fast $k$NN-MT enables the practical use of $k$NN-MT systems in real-world MT applications. The code is available at \url{https://github.com/ShannonAI/fast-knn-nmt}.
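As a rough illustration of the datastore-construction step described above, the following Python sketch restricts the candidate set for each source token to reference positions with the identical token, ranks them by feature distance, and keeps only the target-side entries aligned to the selected positions. All names here (e.g., \texttt{ref\_src\_feats}, \texttt{tgt\_entries\_by\_src\_pos}) are hypothetical placeholders for exposition, not the interface of the released implementation.
\begin{verbatim}
import numpy as np

def build_small_datastore(src_tokens, src_feats,
                          ref_src_tokens, ref_src_feats,
                          tgt_entries_by_src_pos, k=64):
    """Hypothetical sketch: for each source token, keep its k nearest
    reference source tokens with the same surface form, then collect
    the target-side datastore entries aligned to those positions."""
    selected_tgt_entries = []
    for i, tok in enumerate(src_tokens):
        # restrict candidates to reference positions holding the same token
        cand = [j for j, t in enumerate(ref_src_tokens) if t == tok]
        if not cand:
            continue
        # rank candidates by distance to the query token's representation
        dists = np.linalg.norm(ref_src_feats[cand] - src_feats[i], axis=1)
        nearest = [cand[j] for j in np.argsort(dists)[:k]]
        # keep only target entries aligned to the selected source positions;
        # decoding then searches this small set instead of the full corpus
        for j in nearest:
            selected_tgt_entries.extend(tgt_entries_by_src_pos[j])
    return selected_tgt_entries
\end{verbatim}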