Sometimes We Want Translationese

Rapid progress in Neural Machine Translation (NMT) systems over the last few years has focused primarily on improving translation quality and, as a secondary focus, on improving robustness to input perturbations (e.g. spelling and grammatical mistakes). While performance and robustness are important objectives, over-focusing on them risks overlooking other important properties. In this paper, we draw attention to the fact that for some applications, faithfulness to the original (input) text is important to preserve, even if it means introducing unusual language patterns in the (output) translation. We propose a simple, novel way to quantify whether an NMT system exhibits robustness or faithfulness, focusing on the case of word-order perturbations. We explore a suite of functions that perturb the word order of source sentences without deleting or injecting tokens, and measure the effects on the target side in terms of both robustness and faithfulness. Across several experimental conditions, we observe a strong tendency towards robustness rather than faithfulness. These results allow us to better understand the trade-off between faithfulness and robustness in NMT, and open up the possibility of developing systems where users have more autonomy and control in selecting which property is best suited for their use case.
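To make the idea of word-order perturbation concrete, the following is a minimal sketch of functions that reorder the tokens of a source sentence without deleting or injecting any. The specific functions (`swap_adjacent`, `reverse_order`, `rotate`, `shuffle`), the whitespace tokenization, and the helper names are illustrative assumptions, not the suite used in the paper.

```python
import random
from collections import Counter
from typing import Callable, Dict, List


def swap_adjacent(tokens: List[str], i: int = 0) -> List[str]:
    """Swap the tokens at positions i and i+1 (i wraps around)."""
    out = list(tokens)
    if len(out) >= 2:
        i = i % (len(out) - 1)
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


def reverse_order(tokens: List[str]) -> List[str]:
    """Reverse the whole token sequence."""
    return list(reversed(tokens))


def rotate(tokens: List[str], k: int = 1) -> List[str]:
    """Move the first k tokens to the end of the sentence."""
    if not tokens:
        return []
    k %= len(tokens)
    return list(tokens[k:]) + list(tokens[:k])


def shuffle(tokens: List[str], seed: int = 0) -> List[str]:
    """Randomly permute the tokens, seeded for reproducibility."""
    out = list(tokens)
    random.Random(seed).shuffle(out)
    return out


def preserves_tokens(original: List[str], perturbed: List[str]) -> bool:
    """True if the perturbation only reordered tokens (same multiset)."""
    return Counter(original) == Counter(perturbed)


# Hypothetical registry of perturbations, each called with default arguments.
PERTURBATIONS: Dict[str, Callable[[List[str]], List[str]]] = {
    "swap_adjacent": swap_adjacent,
    "reverse": reverse_order,
    "rotate": rotate,
    "shuffle": shuffle,
}


def perturb_all(sentence: str) -> Dict[str, str]:
    """Apply every perturbation to a whitespace-tokenized sentence."""
    tokens = sentence.split()
    return {name: " ".join(fn(tokens)) for name, fn in PERTURBATIONS.items()}
```

Each perturbed source sentence would then be passed to the NMT system, and the output compared against the translation of the unperturbed sentence: an unchanged output suggests robustness, while an output that reflects the altered order suggests faithfulness. How that comparison is scored is left open here, since the abstract does not specify the metric.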
