Back-translation is one of the most widely used methods for improving the performance of neural machine translation systems. Recent research has sought to enhance the effectiveness of this method by increasing the 'diversity' of the generated translations. We argue that the definitions and metrics used to quantify 'diversity' in previous work have been insufficient. This work puts forward a more nuanced framework for understanding diversity in training data, splitting it into lexical diversity and syntactic diversity. We present novel metrics for measuring these different aspects of diversity and carry out an empirical analysis of their effect on final neural machine translation model performance for low-resource English$\leftrightarrow$Turkish and mid-resource English$\leftrightarrow$Icelandic. Our findings show that generating back-translations with nucleus sampling results in higher final model performance, and that this generation method yields high levels of both lexical and syntactic diversity. We also find evidence that lexical diversity is more important than syntactic diversity for back-translation performance.