Neural machine translation inference procedures such as beam search generate the most likely output under the model, which can exacerbate any demographic biases the model exhibits. We focus on gender bias resulting from systematic errors in grammatical gender translation, which can lead to human referents being misrepresented or misgendered. Most approaches to this problem adjust the training data or the model. By contrast, we adjust only the inference procedure. We experiment with reranking n-best lists using gender features obtained automatically from the source sentence, and with applying gender constraints during decoding to improve the gender diversity of the n-best lists. We find that a combination of these techniques allows large gains in WinoMT accuracy without requiring additional bilingual data or an additional NMT model.
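The reranking idea can be illustrated with a minimal sketch. The abstract does not specify how gender features are extracted or how they enter the score, so everything below is an assumption made for illustration: the pronoun lexicon, the `Hypothesis` fields, the `source_gender` and `rerank` helpers, and the additive `bonus` are all hypothetical, and the per-hypothesis gender label is assumed to come from some target-side morphological analysis that is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pronoun lexicon; a real system might use coreference or a tagger.
MASC = {"he", "him", "his"}
FEM = {"she", "her", "hers"}


@dataclass
class Hypothesis:
    text: str
    log_prob: float        # model score of the translation
    gender: Optional[str]  # "m", "f", or None; assumed to be supplied externally


def source_gender(sentence: str) -> Optional[str]:
    """Return "m" or "f" if the source has an unambiguous pronoun cue, else None."""
    tokens = {t.strip(".,;:!?").lower() for t in sentence.split()}
    has_m, has_f = bool(tokens & MASC), bool(tokens & FEM)
    if has_m == has_f:     # no cue, or conflicting cues
        return None
    return "m" if has_m else "f"


def rerank(source: str, nbest: List[Hypothesis], bonus: float = 5.0) -> List[Hypothesis]:
    """Sort hypotheses by model score plus a bonus for matching the source gender."""
    gender = source_gender(source)

    def score(h: Hypothesis) -> float:
        match = gender is not None and h.gender == gender
        return h.log_prob + (bonus if match else 0.0)

    return sorted(nbest, key=score, reverse=True)


if __name__ == "__main__":
    nbest = [
        Hypothesis("Der Arzt beendete seine Schicht.", -1.0, "m"),
        Hypothesis("Die Ärztin beendete ihre Schicht.", -1.8, "f"),
    ]
    best = rerank("The doctor finished her shift.", nbest)[0]
    print(best.text)
```

Reranking of this kind can only help when the n-best list already contains a hypothesis with the desired gender inflection, which is why the abstract pairs it with gender constraints during decoding to diversify the list.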