DiffuSIA: A Spiral Interaction Architecture for Encoder-Decoder Text Diffusion

May 19, 2023

Chao-Hong Tan,Jia-Chen Gu,Zhen-Hua Ling

From arXiv; work in progress.

Diffusion models have emerged as a new state-of-the-art family of deep generative models, and their potential for text generation has recently attracted increasing attention. Existing studies mostly adopt a single-encoder architecture with a partial noising process for conditional text generation, which limits flexibility in conditional modeling. The encoder-decoder architecture is naturally more flexible because its encoder and decoder modules are detachable, making it extensible to multilingual and multimodal generation tasks for both conditions and target texts. However, the encoding process of conditional texts lacks an understanding of the target texts. To this end, a spiral interaction architecture for encoder-decoder text diffusion (DiffuSIA) is proposed. Concretely, the conditional information from the encoder is designed to be captured by the diffusion decoder, while the target information from the decoder is designed to be captured by the conditional encoder. These two information flows pass spirally through multilayer interaction for deep fusion and understanding. DiffuSIA is evaluated on four text generation tasks: paraphrase, text simplification, question generation, and open-domain dialogue generation. Experimental results show that DiffuSIA achieves performance competitive with previous methods on all four tasks, demonstrating the effectiveness and generalization ability of the proposed method.
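The two-way information flow described in the abstract can be pictured with a small toy sketch. This is not the authors' implementation: the function names (`cross_attend`, `spiral_interaction`), the use of unparameterized scaled dot-product attention with a residual add, and the order of the two updates within a layer are all assumptions made for illustration. A real model would presumably also include learned projections, self-attention, feed-forward sublayers, and diffusion timestep conditioning, none of which appear here.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Let each query vector attend over keys_values.

    Uses scaled dot-product attention without learned weights (an
    illustrative simplification) and adds the attended summary back
    to the query as a residual.
    """
    dim = len(queries[0])
    updated = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        summary = [
            sum(w * kv[i] for w, kv in zip(weights, keys_values))
            for i in range(dim)
        ]
        updated.append([qi + si for qi, si in zip(q, summary)])
    return updated


def spiral_interaction(cond_states, target_states, num_layers=3):
    """Alternate the two information flows layer by layer.

    Assumed ordering: the decoder side reads the condition first, then
    the encoder side reads the freshly updated target states.
    """
    for _ in range(num_layers):
        # Decoder captures conditional information from the encoder.
        target_states = cross_attend(target_states, cond_states)
        # Encoder captures target information from the decoder.
        cond_states = cross_attend(cond_states, target_states)
    return cond_states, target_states


# Usage: 5 condition tokens and 3 (noisy) target tokens, 4 dimensions each.
random.seed(0)
cond = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
target = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
cond_out, target_out = spiral_interaction(cond, target, num_layers=2)
print(len(cond_out), len(target_out))
```

In this sketch the encoder output depends on the target states, which is the property the abstract highlights: a plain encoder-decoder would compute the condition encoding once, without any view of the target.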
