HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models

Summarization systems make numerous "decisions" about summary properties during inference, e.g. the degree of copying, the specificity, and the length of outputs. However, these are implicitly encoded within model parameters and specific styles cannot be enforced. To address this, we introduce HydraSum, a new summarization architecture that extends the single decoder framework of current models to a mixture-of-experts version with multiple decoders. We show that HydraSum's multiple decoders automatically learn contrasting summary styles when trained under the standard training objective without any extra supervision. Through experiments on three summarization datasets (CNN, Newsroom and XSum), we show that HydraSum provides a simple mechanism to obtain stylistically diverse summaries by sampling from either individual decoders or their mixtures, outperforming baseline models. Finally, we demonstrate that a small modification to the gating strategy during training can enforce an even stricter style partitioning, e.g. high- vs low-abstractiveness or high- vs low-specificity, allowing users to sample from a larger area in the generation space and vary summary styles along multiple dimensions.
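As a rough illustration of sampling from "individual decoders or their mixtures", the sketch below combines the next-token distributions of several decoders with a gate vector. It is not taken from the paper: the function names `softmax` and `mix_decoders`, the toy logits, and the assumption that the mixture is a convex combination of per-decoder probabilities are all hypothetical choices made for this example.

```python
import math


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_decoders(decoder_logits, gate):
    """Mix per-decoder next-token distributions with gate weights.

    decoder_logits: one list of vocabulary logits per decoder (hypothetical input).
    gate: one non-negative weight per decoder, summing to 1 (assumed form of the gate).
    """
    if len(decoder_logits) != len(gate):
        raise ValueError("need exactly one gate weight per decoder")
    if any(g < 0 for g in gate) or abs(sum(gate) - 1.0) > 1e-9:
        raise ValueError("gate weights must be non-negative and sum to 1")
    probs = [softmax(logits) for logits in decoder_logits]
    vocab_size = len(probs[0])
    # Convex combination of the decoders' probabilities for each token.
    return [sum(g * p[v] for g, p in zip(gate, probs)) for v in range(vocab_size)]


# Toy logits for two decoders over a three-token vocabulary.
toy_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
only_first = mix_decoders(toy_logits, [1.0, 0.0])  # a single decoder's distribution
even_mix = mix_decoders(toy_logits, [0.5, 0.5])    # an equal mixture of both
```

Under these assumptions, a gate of `[1.0, 0.0]` reproduces the first decoder's distribution, while intermediate gates interpolate between the two decoders.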

