Due to exposure bias, most existing natural language generation (NLG) models trained by maximizing the likelihood objective produce poor text at inference time. In this paper, to tackle this problem, we revisit the generate-then-rank framework and propose a joint generator-ranker (JGR) training algorithm for text generation tasks. In JGR, the generator model is trained by maximizing two objectives: the likelihood of the training corpus and the expected reward given by the ranker model. Meanwhile, the ranker model takes samples from the generator as input and learns to distinguish good samples from the rest of the generation pool. The generator and ranker models are optimized alternately until convergence. In the empirical study, the proposed JGR model achieves new state-of-the-art performance on five public benchmarks covering three popular generation tasks: summarization, question generation, and response generation. We will make code, data, and models available at https://github.com/microsoft/AdvNLG.
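The alternating optimization described above can be illustrated with a minimal sketch. The toy models (ToyGenerator, ToyRanker), the margin ranking loss for the ranker, and the REINFORCE-style surrogate for the expected-reward term are all illustrative assumptions, not the authors' implementation; the sketch only shows the overall loop structure of updating the ranker on generator samples and then updating the generator with a likelihood term plus a ranker-reward term.

```python
# Minimal sketch of alternating generator-ranker training (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, hidden = 100, 32

class ToyGenerator(nn.Module):
    """Stand-in for a seq2seq generator; emits a log-distribution over tokens."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(hidden, vocab_size)
    def forward(self, src_repr):
        return F.log_softmax(self.proj(src_repr), dim=-1)

class ToyRanker(nn.Module):
    """Stand-in for a ranker; scores a (source, candidate) pair."""
    def __init__(self):
        super().__init__()
        self.score = nn.Linear(hidden + vocab_size, 1)
    def forward(self, src_repr, cand_onehot):
        return self.score(torch.cat([src_repr, cand_onehot], dim=-1)).squeeze(-1)

generator, ranker = ToyGenerator(), ToyRanker()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
r_opt = torch.optim.Adam(ranker.parameters(), lr=1e-3)

def train_step(src_repr, gold_token):
    # Ranker step: learn to prefer the gold output over a generator sample.
    with torch.no_grad():
        probs = generator(src_repr).exp()
        sampled = torch.multinomial(probs, num_samples=1).squeeze(-1)
    gold_score = ranker(src_repr, F.one_hot(gold_token, vocab_size).float())
    samp_score = ranker(src_repr, F.one_hot(sampled, vocab_size).float())
    r_loss = F.margin_ranking_loss(gold_score, samp_score,
                                   target=torch.ones_like(gold_score), margin=1.0)
    r_opt.zero_grad(); r_loss.backward(); r_opt.step()

    # Generator step: likelihood objective plus expected reward from the (frozen) ranker.
    log_probs = generator(src_repr)
    mle_loss = F.nll_loss(log_probs, gold_token)
    sampled = torch.multinomial(log_probs.exp().detach(), num_samples=1).squeeze(-1)
    with torch.no_grad():
        reward = ranker(src_repr, F.one_hot(sampled, vocab_size).float())
    # REINFORCE-style surrogate for the expected-reward term (an assumption).
    rl_loss = -(reward * log_probs.gather(1, sampled.unsqueeze(-1)).squeeze(-1)).mean()
    g_loss = mle_loss + rl_loss
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return g_loss.item(), r_loss.item()

# Toy usage with random data standing in for encoded sources and gold tokens.
for step in range(3):
    src = torch.randn(8, hidden)
    gold = torch.randint(0, vocab_size, (8,))
    print(train_step(src, gold))
```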