Naturalness of long-range information structure -- coherence -- remains a challenge in language generation. Large language models have insufficiently learned such structure: their long-form generations diverge from natural text on measures of coherence. To reduce this divergence, we propose coherence boosting, an inference procedure that increases the effect of distant context on next-token prediction. We demonstrate the benefits of coherence boosting with pretrained models through distributional analyses of generated ordinary text and dialog responses. We also find that coherence boosting with state-of-the-art models yields performance gains on various zero-shot NLP tasks with no additional training.
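A minimal sketch of what such an inference-time procedure could look like: the abstract only states that coherence boosting "increases the effect of distant context on next-token prediction", so the contrastive log-linear combination below (upweighting full-context logits against logits computed from a truncated, recent-only context), the weight `alpha`, and the truncation length `short_len` are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch: contrast full-context and short-context logits so that the
# contribution of distant tokens is amplified. The specific weighting scheme
# is an assumption for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def boosted_next_token_logits(model, input_ids, alpha=0.5, short_len=8):
    """Return next-token logits with the distant-context contribution boosted."""
    with torch.no_grad():
        full_logits = model(input_ids).logits[:, -1, :]        # full context
        short_ids = input_ids[:, -short_len:]                   # recent tokens only
        short_logits = model(short_ids).logits[:, -1, :]        # truncated context
    # Amplify whatever the distant context adds beyond the recent tokens.
    return (1 + alpha) * full_logits - alpha * short_logits

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
ids = tokenizer(
    "The report on Arctic shipping routes concluded that the", return_tensors="pt"
).input_ids
logits = boosted_next_token_logits(model, ids)
print(tokenizer.decode(logits.argmax(dim=-1)))
```

Sampling or greedy decoding can then proceed from these boosted logits token by token, with no change to the model's weights.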