Vocabulary selection, or lexical shortlisting, is a well-known technique to improve the inference latency of Neural Machine Translation models by constraining the set of allowed output words during inference. The chosen set is typically determined by the parameters of a separately trained alignment model, independent of the source-sentence context at inference time. While vocabulary selection appears competitive with respect to automatic quality metrics in prior work, we show that it can fail to select the right set of output words, particularly for semantically non-compositional linguistic phenomena such as idiomatic expressions, leading to reduced translation quality as perceived by humans. Trading off latency for quality by increasing the size of the allowed set is often not an option in real-world scenarios. We propose a model of vocabulary selection, integrated into the neural translation model, that predicts the set of allowed output words from contextualized encoder representations. This restores the translation quality of an unconstrained system, as measured by human evaluations on WMT newstest2020 and on idiomatic expressions, at an inference latency competitive with alignment-based selection using aggressive thresholds, thereby removing the dependency on separately trained alignment models.
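The proposed mechanism can be pictured as a small prediction head on top of the encoder. Below is a minimal PyTorch sketch of this idea, not the paper's implementation: the module name, the max-pooling over source positions, and the fixed top-k cutoff are illustrative assumptions, and the actual head architecture, training objective, and thresholding scheme in the paper may differ.

```python
import torch
import torch.nn as nn

class VocabSelectionHead(nn.Module):
    """Sketch of encoder-based vocabulary selection: predicts the set of
    allowed target words from contextualized encoder states, then builds a
    mask that restricts the decoder's output distribution to that set."""

    def __init__(self, hidden_dim: int, vocab_size: int, top_k: int = 1024):
        super().__init__()
        # One linear projection from encoder states to target-vocabulary scores.
        # (Hypothetical choice; a trained system might use BCE against the bag
        # of reference target words, or a threshold instead of a fixed top-k.)
        self.proj = nn.Linear(hidden_dim, vocab_size)
        self.top_k = top_k  # size of the allowed set: the latency/quality knob

    def forward(self, encoder_states: torch.Tensor, src_mask: torch.Tensor) -> torch.Tensor:
        # encoder_states: [batch, src_len, hidden_dim]
        # src_mask:       [batch, src_len] bool, True for real (non-pad) tokens
        scores = self.proj(encoder_states)                  # [batch, src_len, vocab]
        scores = scores.masked_fill(~src_mask.unsqueeze(-1), float("-inf"))
        pooled = scores.max(dim=1).values                   # max-pool over source positions
        # Keep the top-k scoring target words per sentence as the allowed set.
        allowed = pooled.topk(self.top_k, dim=-1).indices   # [batch, top_k]
        mask = torch.full_like(pooled, float("-inf"))
        mask.scatter_(-1, allowed, 0.0)                     # 0 for allowed words, -inf otherwise
        return mask                                         # add to decoder logits at each step
```

In this sketch the mask would be computed once per source sentence and added to the decoder's logits at every decoding step. In practice, the latency gain of vocabulary selection comes from slicing the output projection and softmax down to the allowed rows rather than merely masking full-vocabulary logits; the mask form above is just the simplest way to illustrate the constrained output set.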