Large pre-trained language models have shown promising results in a wide array of tasks such as narrative generation, question answering, and machine translation. Likewise, the current trend in the literature focuses heavily on controlling salient properties of generated text, including sentiment, topic, and coherence, to produce more human-like outputs. In this work, we introduce Uniform Complexity for Text Generation (UCTG), a challenge that requires existing models to generate text whose complexity remains uniform with respect to the inputs or prompts used. For example, if the reading level of an input prompt is appropriate for low-level learners (e.g., A2 in the CEFR), then the text generated by an NLG system should also match this level for increased readability. In a controlled narrative generation task, we surveyed over 160 linguistic and cognitively motivated features for evaluating text readability and found that GPT-2 models, and even humans, struggle to preserve the linguistic complexity of the input prompts. Finally, we lay out potential methods and approaches that can be incorporated into the general framework of steering language models towards addressing this important challenge.
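To make the UCTG setting concrete, the sketch below compares the readability of a prompt against a GPT-2 continuation. This is an illustration only, not the paper's evaluation pipeline: the paper draws on 160+ linguistic and cognitively motivated features, whereas this sketch assumes a single Flesch-Kincaid grade proxy (via the textstat package) and an off-the-shelf Hugging Face GPT-2 model, both of which are assumptions for demonstration.

```python
# Minimal sketch (assumption: Flesch-Kincaid grade as a stand-in for the
# paper's 160+ readability features) showing the UCTG idea: the complexity
# of the generated continuation should stay close to that of the prompt.
import textstat
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical A2-level prompt used purely for illustration.
prompt = "The cat sat on the mat. It was a sunny day and the cat was happy."
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a continuation of the prompt.
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
continuation = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# Compare a single readability proxy; UCTG asks these scores to stay close.
prompt_grade = textstat.flesch_kincaid_grade(prompt)
generated_grade = textstat.flesch_kincaid_grade(continuation)
print(f"Prompt grade level:    {prompt_grade:.1f}")
print(f"Generated grade level: {generated_grade:.1f}")
print(f"Absolute difference:   {abs(prompt_grade - generated_grade):.1f}")
```

A large gap between the two scores would indicate the kind of complexity drift that the UCTG challenge targets.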