Recently, contrastive learning has attracted increasing interest in neural text generation as a new solution to alleviate the exposure bias problem. It introduces a sequence-level training signal, which is crucial for generation tasks that rely on auto-regressive decoding. However, previous methods that apply contrastive learning to neural text generation usually lead to inferior performance. In this paper, we analyse the underlying reasons and propose a new Contrastive Neural Text generation framework, CoNT. CoNT addresses the bottlenecks that prevent contrastive learning from being widely adopted in generation tasks from three aspects -- the construction of contrastive examples, the choice of the contrastive loss, and the strategy in decoding. We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation. Experimental results show that CoNT clearly outperforms the conventional training framework on all ten benchmarks with a convincing margin. In particular, CoNT surpasses the previous most competitive contrastive learning method for text generation by 1.50 BLEU on machine translation and 1.77 ROUGE-1 on summarization, respectively. It achieves new state-of-the-art results on summarization, code comment generation (without external data) and data-to-text generation.
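To make the sequence-level training signal mentioned above concrete, the following is a minimal sketch of a margin-based contrastive loss over model-generated candidates. It is an illustrative simplification, not CoNT's actual N-pairs formulation; the function name, the embedding inputs, and the margin value are assumptions introduced here for exposition.

```python
# Minimal sketch of a sequence-level contrastive loss (hypothetical names;
# not the authors' implementation). The ground-truth sequence should score
# higher against the source than any model-generated candidate, by a margin.
import torch
import torch.nn.functional as F


def sequence_contrastive_loss(
    hyp_embeds: torch.Tensor,   # (batch, num_candidates, dim) embeddings of generated candidates
    ref_embeds: torch.Tensor,   # (batch, dim) embeddings of the ground-truth sequences
    src_embeds: torch.Tensor,   # (batch, dim) embeddings of the source sequences
    margin: float = 0.01,       # assumed hyper-parameter, not taken from the paper
) -> torch.Tensor:
    """Penalize candidates whose similarity to the source comes within
    `margin` of (or exceeds) the reference's similarity to the source."""
    # Cosine similarity between the source and each candidate: (batch, num_candidates).
    cand_sim = F.cosine_similarity(src_embeds.unsqueeze(1), hyp_embeds, dim=-1)
    # Cosine similarity between the source and the reference: (batch,).
    ref_sim = F.cosine_similarity(src_embeds, ref_embeds, dim=-1)

    # Hinge: the reference must outscore every candidate by at least `margin`.
    loss = F.relu(cand_sim - ref_sim.unsqueeze(1) + margin)
    return loss.mean()


if __name__ == "__main__":
    batch, num_cands, dim = 4, 8, 256
    loss = sequence_contrastive_loss(
        torch.randn(batch, num_cands, dim),
        torch.randn(batch, dim),
        torch.randn(batch, dim),
    )
    print(loss.item())
```

In this sketch the candidates would come from the model's own decoding (e.g., beam search), which is what makes the signal sequence-level rather than token-level; the learned similarity could likewise be reused to re-rank hypotheses at inference time.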