Non-autoregressive (NAR) text generation has attracted much attention in natural language processing because it greatly reduces inference latency, although at the cost of generation accuracy. Recently, diffusion models, a class of latent variable generative models, have been introduced into NAR text generation and have shown improved generation quality. In this survey, we review recent progress in diffusion models for NAR text generation. As background, we first present the general definition of diffusion models and of text diffusion models, and then discuss their merits for NAR generation. As the core content, we introduce the two mainstream types of diffusion models used in existing text diffusion work and review the key designs of the diffusion process. Moreover, we discuss the use of pre-trained language models (PLMs) in text diffusion models and introduce optimization techniques for text data. Finally, we discuss several promising directions and conclude the paper. Our survey aims to provide researchers with a systematic reference on text diffusion models for NAR generation.