While large-scale neural language models, such as GPT-2 and BART, have achieved impressive results on various text generation tasks, they tend to get stuck in undesirable sentence-level loops under maximization-based decoding algorithms (\textit{e.g.}, greedy search). This phenomenon is counter-intuitive since consecutive sentence-level repetitions are rare in human corpora (e.g., 0.02\% in Wikitext-103). To investigate the underlying reasons for generating consecutive sentence-level repetitions, we study the relationship between the probabilities of repeated tokens and their previous repetitions in the context. Through our quantitative experiments, we find that 1) language models prefer to repeat the previous sentence; 2) sentence-level repetitions have a \textit{self-reinforcement effect}: the more times a sentence is repeated in the context, the higher the probability of continuing to generate that sentence; 3) sentences with higher initial probabilities usually have a stronger self-reinforcement effect. Motivated by these findings, we propose a simple and effective training method, \textbf{DITTO} (Pseu\underline{D}o-Repet\underline{IT}ion Penaliza\underline{T}i\underline{O}n), in which the model learns to penalize the probabilities of sentence-level repetitions on pseudo repetitive data. Although our method is motivated by mitigating repetitions, experiments show that DITTO not only mitigates the repetition issue without sacrificing perplexity, but also achieves better generation quality. Extensive experiments on open-ended text generation (Wikitext-103) and text summarization (CNN/DailyMail) demonstrate the generality and effectiveness of our method.
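For concreteness, the following is a minimal PyTorch-style sketch of the penalization idea on a pseudo repetitive sample (a single sentence repeated several times). The function name \texttt{ditto\_loss}, the tensor shapes, and the penalization factor \texttt{lam} are illustrative assumptions, not the exact implementation or hyperparameters of the paper.

\begin{verbatim}
import torch
import torch.nn.functional as F

def ditto_loss(logits, target_ids, sent_len, lam=0.5):
    """Sketch: penalize sentence-level repetition on pseudo repetitive data.

    Assumes the sequence is one sentence of length `sent_len` repeated
    N times, `logits` has shape (seq_len, vocab_size) and `target_ids`
    has shape (seq_len,). All names here are illustrative, not the
    authors' exact API.
    """
    probs = F.softmax(logits, dim=-1)
    # Probability the model assigns to each gold token.
    tok_probs = probs.gather(-1, target_ids.unsqueeze(-1)).squeeze(-1)
    # Reshape into (num_repetitions, sent_len): one row per repetition.
    reps = tok_probs.view(-1, sent_len)
    prev = reps[:-1].detach()  # previous repetition, treated as a target
    curr = reps[1:]            # current repetition
    # Discourage the current repetition's token probability from
    # exceeding lam times its probability in the previous repetition,
    # countering the self-reinforcement effect.
    loss = -torch.log(1.0 - torch.abs(curr - lam * prev) + 1e-9)
    return loss.mean()
\end{verbatim}

In this sketch, the pseudo repetitive sample would be built offline by sampling a sentence from the training corpus and concatenating it with itself until the maximum sequence length is reached; the DITTO objective is then mixed with the standard language-modeling loss during training.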