Hierarchical Transformer for Task Oriented Dialog Systems

From arXiv. v3: latest camera-ready version; 10 pages; code: https://github.com/bsantraigi/HIER , https://github.com/bsantraigi/hier-transformer-pytorch. v2: to appear in NAACL 2021 (Long Paper). v1: preprint.

Generative models for dialog systems have gained much interest because of the recent success of RNN and Transformer based models in tasks like question answering and summarization. Although dialog response generation is generally treated as a sequence-to-sequence (Seq2Seq) problem, researchers have found it challenging to train dialog systems with standard Seq2Seq models. To help the model learn meaningful utterance-level and conversation-level features, Sordoni et al. (2015b) and Serban et al. (2016) proposed the Hierarchical RNN architecture, which was later adopted by several other RNN-based dialog systems. With transformer-based models now dominating Seq2Seq problems, a natural question is whether the notion of hierarchy applies to transformer-based dialog systems. In this paper, we propose a generalized framework for Hierarchical Transformer Encoders and show how a standard transformer can be morphed into any hierarchical encoder, including HRED-like and HIBERT-like models, by using specially designed attention masks and positional encodings. Through a wide range of experiments, we demonstrate that hierarchical encoding helps transformer-based models achieve better natural language understanding of the context in task-oriented dialog systems.
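The abstract says only that hierarchy is obtained through "specially designed attention masks and positional encodings" and does not spell out their form. As a rough illustration of the idea, the sketch below builds one plausible utterance-level mask for a flattened dialog context: each token may attend only to tokens in its own utterance, and positions restart at zero in every utterance. The function names and the exact mask layout are assumptions made for this example, not the authors' implementation.

```python
from typing import List


def utterance_ids(lengths: List[int]) -> List[int]:
    """Map each token of the flattened context to the index of its utterance."""
    ids: List[int] = []
    for utt_index, n_tokens in enumerate(lengths):
        ids.extend([utt_index] * n_tokens)
    return ids


def block_diagonal_mask(lengths: List[int]) -> List[List[bool]]:
    """Hypothetical utterance-level mask.

    mask[i][j] is True when token i is allowed to attend to token j,
    which here means both tokens belong to the same utterance.
    """
    ids = utterance_ids(lengths)
    return [[a == b for b in ids] for a in ids]


def local_positions(lengths: List[int]) -> List[int]:
    """Position indices that restart at 0 for every utterance (an assumption)."""
    return [pos for n_tokens in lengths for pos in range(n_tokens)]


if __name__ == "__main__":
    # A context made of two utterances with 2 and 3 tokens respectively.
    lengths = [2, 3]
    for row in block_diagonal_mask(lengths):
        print("".join("1" if allowed else "0" for allowed in row))
    print(local_positions(lengths))
```

In a setup like this, a second mask over one summary token per utterance could play the role of the conversation-level encoder; that part is omitted here because the abstract gives no detail about it.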

