In neural machine translation (NMT), we search for the mode of the model distribution to form predictions. The mode and other high-probability translations found by beam search have been shown to be inadequate, often in a number of ways. This prevents improving translation quality through better search, as these idiosyncratic translations end up selected by the decoding algorithm, a problem known as the beam search curse. Recently, an approximation to minimum Bayes risk (MBR) decoding has been proposed as an alternative decision rule that would likely not suffer from the same problems. We analyse this approximation and establish that it has no equivalent to the beam search curse. We then design approximations that decouple the cost of exploration from the cost of robust estimation of expected utility. This allows for much larger hypothesis spaces, which we show to be beneficial. We also show that mode-seeking strategies can aid in constructing compact sets of promising hypotheses and that MBR is effective in identifying good translations in them. We conduct experiments on three language pairs varying in the amount of resources available: English into and from German, Romanian, and Nepali.
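To make the decision rule concrete, the following is a minimal sketch of sampling-based MBR with the hypothesis set kept separate from the samples used to estimate expected utility. It is an illustration under stated assumptions, not the implementation used in this work: the function names `unigram_f1` and `mbr_decode` are hypothetical, the utility is a toy unigram F1 standing in for a real translation metric, and the hypotheses and pseudo-references are assumed to have been drawn from the model beforehand.

```python
from collections import Counter
from typing import Callable, Sequence


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (an assumption): unigram F1 between whitespace-tokenised strings."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(
    hypotheses: Sequence[str],
    pseudo_refs: Sequence[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return the hypothesis with the highest Monte Carlo estimate of expected utility.

    `hypotheses` is the set being explored; `pseudo_refs` are model samples used
    only for estimation. The two sets may differ in size and in how they were built.
    """
    if not hypotheses or not pseudo_refs:
        raise ValueError("need at least one hypothesis and one pseudo-reference")

    def expected_utility(h: str) -> float:
        return sum(utility(h, r) for r in pseudo_refs) / len(pseudo_refs)

    return max(hypotheses, key=expected_utility)


# Hypothetical candidates (e.g. from beam search or sampling) and model samples.
hypotheses = ["the cat sat", "a cat sat down", "dog"]
pseudo_refs = ["the cat sat", "the cat sat down", "a cat sat"]
best = mbr_decode(hypotheses, pseudo_refs)
```

Because the two arguments are independent, the hypothesis set can grow, or be built with a mode-seeking strategy, without raising the number of samples needed per estimate; the cost is one utility call per hypothesis and pseudo-reference pair.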