Transformer-based models are the modern workhorses of neural machine translation (NMT), reaching state-of-the-art results across several benchmarks. Despite their impressive accuracy, we observe a systemic and rudimentary class of errors that transformer-based models make when translating from a language that does not mark gender on nouns into languages that do. We find that even when the surrounding context provides unambiguous evidence of the appropriate grammatical gender marking, none of the transformer-based models we tested consistently assigned the correct gender to occupation nouns. We release an evaluation scheme and dataset for measuring the ability of transformer-based NMT models to translate gender morphology correctly in unambiguous contexts across syntactically diverse sentences. Our dataset covers translation from an English source into 20 languages from several different language families. With this dataset available, we hope that the NMT community can iterate on solutions for this class of especially egregious errors.