We show how to derive state-of-the-art unsupervised neural machine translation systems from generatively pre-trained language models. Our method consists of three steps: few-shot amplification, distillation, and backtranslation. We first use the zero-shot translation ability of large pre-trained language models to generate translations for a small set of unlabeled sentences. We then amplify these zero-shot translations by using them as few-shot demonstrations for sampling a larger synthetic dataset. We then distill this dataset by discarding the few-shot demonstrations and fine-tuning the model on the remaining synthetic pairs. During backtranslation, we repeatedly generate translations for a set of inputs and then fine-tune a single language model on both directions of the translation task at once, ensuring cycle-consistency by swapping the roles of gold monotext and generated translations when fine-tuning. By using our method to leverage GPT-3's zero-shot translation capability, we achieve a new state-of-the-art in unsupervised translation on the WMT14 English-French benchmark, attaining a BLEU score of 42.1.
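The pipeline described above reduces to a short loop over prompting and fine-tuning. The following is a minimal Python sketch under stated assumptions: `sample(model, prompt)` and `finetune(model, pairs)` are hypothetical stand-ins for a real language-model API, and the prompt templates and function names are illustrative rather than the paper's exact formats.

```python
# Minimal sketch of the three steps: few-shot amplification, distillation,
# and backtranslation. `sample(model, prompt)` draws a completion from a
# language model; `finetune(model, pairs)` fine-tunes it on (prompt,
# completion) pairs. Both are hypothetical stand-ins, not a real API.

def zero_shot_translate(sample, model, src_lang, tgt_lang, text):
    """Zero-shot translation by prompting alone."""
    return sample(model, f"{src_lang}: {text}\n{tgt_lang}:")

def amplify(sample, model, demos, src_lang, tgt_lang, monotext):
    """Few-shot amplification: reuse a handful of zero-shot translations
    as demonstrations to sample a larger synthetic parallel dataset."""
    header = "".join(f"{src_lang}: {s}\n{tgt_lang}: {t}\n\n" for s, t in demos)
    return [(x, sample(model, header + f"{src_lang}: {x}\n{tgt_lang}:"))
            for x in monotext]

def distill(finetune, model, synthetic_pairs, src_lang, tgt_lang):
    """Distillation: drop the demonstrations and fine-tune on the bare
    zero-shot format, so the model translates without prompt context."""
    pairs = [(f"{src_lang}: {s}\n{tgt_lang}:", f" {t}")
             for s, t in synthetic_pairs]
    return finetune(model, pairs)

def backtranslation_round(sample, finetune, model, en_mono, fr_mono):
    """One round of backtranslation: translate gold monotext in both
    directions, then fine-tune one model on both directions at once,
    swapping roles so the generated translation becomes the source and
    the gold monotext the target (the cycle-consistency described above)."""
    en_fr = [(x, zero_shot_translate(sample, model, "English", "French", x))
             for x in en_mono]
    fr_en = [(x, zero_shot_translate(sample, model, "French", "English", x))
             for x in fr_mono]
    pairs = ([(f"French: {t}\nEnglish:", f" {s}") for s, t in en_fr] +
             [(f"English: {t}\nFrench:", f" {s}") for s, t in fr_en])
    return finetune(model, pairs)
```

Passing the model and the two helpers explicitly keeps the sketch self-contained; in practice the backtranslation round would be iterated, with each round sampling from the most recently fine-tuned model.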