Text generation aims to produce plausible and readable text in a human language from input data. The resurgence of deep learning has greatly advanced this field, in particular through neural generation models based on pre-trained language models (PLMs). Text generation based on PLMs is viewed as a promising approach in both academia and industry. In this paper, we provide a survey of the utilization of PLMs in text generation. We begin by introducing three key aspects of applying PLMs to text generation: 1) how to encode the input into representations that preserve the input semantics and can be fused into PLMs; 2) how to design an effective PLM to serve as the generation model; and 3) how to effectively optimize PLMs given the reference text and ensure that the generated texts satisfy special text properties. We then present the major challenges that arise in these aspects, as well as possible solutions for them. We also include a summary of various useful resources and typical text generation applications based on PLMs. Finally, we highlight future research directions that will further improve PLMs for text generation. This comprehensive survey is intended to help researchers interested in text generation problems learn the core concepts, the main techniques, and the latest developments in this area based on PLMs.