In this paper, we study the task of improving the cohesion and coherence of long-form text generated by language models. To this end, we propose RSTGen, a framework that utilises Rhetorical Structure Theory (RST), a classical language theory, to control the discourse structure, semantics and topics of generated text. Firstly, we demonstrate our model's ability to control the structural discourse and semantic features of generated text in an open-generation evaluation. We then experiment on two challenging long-form text tasks: argument generation and story generation. Evaluation using automated metrics and a metric with high correlation to human evaluation shows that our model performs competitively against existing models, while offering significantly more control over generated text than alternative methods.