DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

Recently, diffusion models have emerged as a new paradigm for generative models. Despite the success in domains using continuous signals such as vision and audio, adapting diffusion models to natural language is difficult due to the discrete nature of text. We tackle this challenge by proposing DiffuSeq: a diffusion model designed for sequence-to-sequence (Seq2Seq) text generation tasks. Upon extensive evaluation over a wide range of Seq2Seq tasks, we find DiffuSeq achieving comparable or even better performance than six established baselines, including a state-of-the-art model that is based on pre-trained language models. Apart from quality, an intriguing property of DiffuSeq is its high diversity during generation, which is desired in many Seq2Seq tasks. We further include a theoretical analysis revealing the connection between DiffuSeq and autoregressive/non-autoregressive models. Bringing together theoretical analysis and empirical evidence, we demonstrate the great potential of diffusion models in complex conditional language generation tasks.
