用于对话建模的状态记忆增强变换器 (Stateful Memory-Augmented Transformers for Dialogue Modeling) - 专知论文

会员服务 ·

0

任务对话系统 · 变换 · MoDELS · Performer · 语言模型化 ·

2022 年 9 月 15 日

Stateful Memory-Augmented Transformers for Dialogue Modeling

翻译：用于对话建模的状态记忆增强变换器

Qingyang Wu,Zhou Yu

Transformer encoder-decoder models have shown impressive performance in dialogue modeling. However, as Transformers are inefficient in processing long sequences, dialogue history length often needs to be truncated. To address this problem, we propose a new memory-augmented Transformer that is compatible with existing pre-trained encoder-decoder models and enables efficient preservation of history information. It incorporates a separate memory module alongside the pre-trained Transformer to effectively interchange information between the memory states and the current input context. We evaluate our model on three dialogue datasets and two language modeling datasets. Experimental results show that our method has achieved superior efficiency and performance compared to other pre-trained Transformer baselines.

翻译：变换器编码器- 解码器模型在对话模型中表现出令人印象深刻的性能。但是,由于变换器在处理长序列方面效率低下,对话历史长度往往需要缩短。为了解决这个问题,我们提议了一个新的内存强化变换器,该变换器与现有的预先训练的编码器- 解码器模型兼容,并能够有效地保存历史信息。它包含一个单独的内存模块,与预先训练的变换器一起,在记忆状态和当前输入环境之间有效交换信息。我们评估了三个对话数据集和两个语言模型的模型。实验结果表明,我们的方法比其他经过训练的变换器基线提高了效率和性能。

0

相关内容

任务对话系统

任务对话系统

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【CIKM2019 Tutorial】Recommendation for Multi-Stakeholders and through Neural Review Mining，附158页PDF免费下载

【CIKM2019 Tutorial】Recommendation for Multi-Stakeholders and through Neural Review Mining，附158页PDF免费下载

专知会员服务

21+阅读 · 2019年11月3日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：推荐系统的常用数据集

LibRec 精选：推荐系统的常用数据集

LibRec智能推荐

17+阅读 · 2019年2月15日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

专知

30+阅读 · 2018年3月22日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

激活嗅觉通路的“嗅三针”调控阿尔茨海默病转基因模型小鼠Aβ-MAPK-MG的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于UGC的应急响应决策支持系统关键技术研究

国家自然科学基金

12+阅读 · 2014年12月31日

基于非线性动力学的子宫收缩模型研究

国家自然科学基金

1+阅读 · 2013年12月31日

转录因子YY1在类风湿性关节炎发病中的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

ADAMTS8在结直肠癌中的抑癌作用及其负调控MAPK/ERK通路的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于时频二维训练信息的高谱效多天线TFT-OFDM技术研究

国家自然科学基金

1+阅读 · 2012年12月31日

lncRNA在JAK2V617F突变阳性的经典型骨髓增殖性肿瘤发病中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于液晶的可调谐WGM振荡器

国家自然科学基金

0+阅读 · 2009年12月31日

超高密度热辅助磁记录用FePt/FeRh垂直取向双层薄膜介质的研究

国家自然科学基金

0+阅读 · 2009年12月31日

多通路候选基因单体型与唇腭裂风险的相关性研究

国家自然科学基金

0+阅读 · 2009年12月31日

Question-Interlocutor Scope Realized Graph Modeling over Key Utterances for Dialogue Reading Comprehension

Arxiv

0+阅读 · 2022年10月26日

Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding

Arxiv

0+阅读 · 2022年10月26日

Auxiliary task discovery through generate-and-test

Arxiv

0+阅读 · 2022年10月25日

Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation

Arxiv

0+阅读 · 2022年10月21日

An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding

Arxiv

0+阅读 · 2022年10月21日

A Survey on Data Augmentation for Text Classification

A Survey on Data Augmentation for Text Classification

Arxiv

16+阅读 · 2021年7月7日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Data Augmentation using Pre-trained Transformer Models

Arxiv

17+阅读 · 2020年3月4日

Memory Augmented Graph Neural Networks for Sequential Recommendation

Memory Augmented Graph Neural Networks for Sequential Recommendation

Arxiv

13+阅读 · 2019年12月26日

End-to-End Multi-Task Learning with Attention

Arxiv

19+阅读 · 2018年3月28日

VIP会员

文章信息

相关主题

任务对话系统

语言模型化

相关VIP内容

NeurlPS 2022 | 自然语言处理相关论文分类整理

NeurlPS 2022 | 自然语言处理相关论文分类整理

专知会员服务

51+阅读 · 2022年10月2日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【CIKM2019 Tutorial】Recommendation for Multi-Stakeholders and through Neural Review Mining，附158页PDF免费下载

【CIKM2019 Tutorial】Recommendation for Multi-Stakeholders and through Neural Review Mining，附158页PDF免费下载

专知会员服务

21+阅读 · 2019年11月3日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

大型语言模型遇上文本属性图：一种融合框架与应用的综述

人工智能赋能自主武器与人类控制第三部分：人类控制与系统操作员 | 35页

【博士论文】用于概率程序与生成模型的变分推断

军事指挥控制系统：2025年5种用途

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：推荐系统的常用数据集

LibRec 精选：推荐系统的常用数据集

LibRec智能推荐

17+阅读 · 2019年2月15日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

专知

30+阅读 · 2018年3月22日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

相关论文

Question-Interlocutor Scope Realized Graph Modeling over Key Utterances for Dialogue Reading Comprehension

Arxiv

0+阅读 · 2022年10月26日

Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding

Arxiv

0+阅读 · 2022年10月26日

Auxiliary task discovery through generate-and-test

Arxiv

0+阅读 · 2022年10月25日

Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation

Arxiv

0+阅读 · 2022年10月21日

An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding

Arxiv

0+阅读 · 2022年10月21日

A Survey on Data Augmentation for Text Classification

A Survey on Data Augmentation for Text Classification

Arxiv

16+阅读 · 2021年7月7日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Data Augmentation using Pre-trained Transformer Models

Arxiv

17+阅读 · 2020年3月4日

Memory Augmented Graph Neural Networks for Sequential Recommendation

Memory Augmented Graph Neural Networks for Sequential Recommendation

Arxiv

13+阅读 · 2019年12月26日

End-to-End Multi-Task Learning with Attention

Arxiv

19+阅读 · 2018年3月28日

相关基金

激活嗅觉通路的“嗅三针”调控阿尔茨海默病转基因模型小鼠Aβ-MAPK-MG的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于UGC的应急响应决策支持系统关键技术研究

国家自然科学基金

12+阅读 · 2014年12月31日

基于非线性动力学的子宫收缩模型研究

国家自然科学基金

1+阅读 · 2013年12月31日

转录因子YY1在类风湿性关节炎发病中的作用及其机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

ADAMTS8在结直肠癌中的抑癌作用及其负调控MAPK/ERK通路的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于时频二维训练信息的高谱效多天线TFT-OFDM技术研究

国家自然科学基金

1+阅读 · 2012年12月31日

lncRNA在JAK2V617F突变阳性的经典型骨髓增殖性肿瘤发病中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于液晶的可调谐WGM振荡器

国家自然科学基金

0+阅读 · 2009年12月31日

超高密度热辅助磁记录用FePt/FeRh垂直取向双层薄膜介质的研究

国家自然科学基金

0+阅读 · 2009年12月31日

多通路候选基因单体型与唇腭裂风险的相关性研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员