Non-autoregressive (NAR) models can generate sentences with less computation than autoregressive models, but at the cost of generation quality. Previous studies have addressed this issue through iterative decoding. This study proposes using nearest neighbors as the initial state of an NAR decoder and editing them iteratively. We present a novel training strategy for learning the edit operations on neighbors to improve NAR text generation. Experimental results show that the proposed method (NeighborEdit) achieves higher translation quality (1.69 points higher than a vanilla Transformer) with far fewer decoding iterations (one-eighteenth as many) on the JRC-Acquis En-De dataset, a common benchmark for machine translation with nearest neighbors. We also confirm the effectiveness of the proposed method on a data-to-text task (WikiBio). In addition, the proposed method outperforms an NAR baseline on the WMT'14 En-De dataset. We also report an analysis of the neighbor examples used in the proposed method.
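To make the decoding procedure concrete, here is a minimal, self-contained sketch of the neighbor-initialized iterative editing loop described in the abstract. This is not the authors' implementation: the toy parallel corpus, the difflib-based retrieval, and the `predict_edits` stub (standing in for the trained NAR decoder that predicts edit operations over all positions in parallel) are all illustrative assumptions.

```python
# Hypothetical sketch of neighbor-initialized iterative NAR editing.
# Retrieval metric, corpus, and edit predictor are stand-ins, not the
# paper's actual components.

import difflib

TRAIN = [  # toy parallel corpus: (source, target)
    ("das haus ist klein", "the house is small"),
    ("die katze schläft", "the cat sleeps"),
]

def nearest_target(src: str) -> str:
    """Retrieve the target side of the most similar training source.
    (Similarity here is a simple string ratio; the paper retrieves
    nearest neighbors of the input sentence.)"""
    best = max(TRAIN,
               key=lambda p: difflib.SequenceMatcher(None, src, p[0]).ratio())
    return best[1]

def predict_edits(src: str, draft: list[str]) -> list[str]:
    """Stand-in for the trained NAR decoder, which would predict edit
    operations (e.g., keep/replace/insert/delete) for every position in
    parallel. Here we hard-code one toy replacement so the loop is
    observable."""
    return [("big" if w == "small" else w) for w in draft]

def neighbor_edit_decode(src: str, max_iters: int = 10) -> str:
    draft = nearest_target(src).split()   # neighbor initializes the decoder
    for _ in range(max_iters):
        new_draft = predict_edits(src, draft)
        if new_draft == draft:            # converged: no edits predicted
            break
        draft = new_draft
    return " ".join(draft)

print(neighbor_edit_decode("das haus ist groß"))  # -> "the house is big"
```

Starting from a retrieved neighbor rather than a sequence of placeholder tokens is what lets the decoder converge in few iterations: most tokens of the neighbor are already correct, so only a small number of edits remain to be predicted.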