We study the task of long-form opinion text generation, which faces at least two distinct challenges. First, existing neural generation models fall short of coherence, thus requiring efficient content planning. Second, diverse types of information are needed to guide the generator to cover both subjective and objective content. To this end, we propose DYPLOC, a generation framework that conducts dynamic planning of content while generating the output based on a novel design of mixed language models. To enrich the generation with diverse content, we further propose to use large pre-trained models to predict relevant concepts and to generate claims. We experiment with two challenging tasks on newly collected datasets: (1) argument generation with Reddit ChangeMyView, and (2) writing articles using New York Times' Opinion section. Automatic evaluation shows that our model significantly outperforms competitive comparisons. Human judges further confirm that our generations are more coherent with richer content.