Despite the prominence of neural abstractive summarization models, we know little about how they actually form summaries and how to understand where their decisions come from. We propose a two-step method to interpret summarization model decisions. We first analyze the model's behavior by ablating the full model to categorize each decoder decision into one of several generation modes: roughly, is the model behaving like a language model, is it relying heavily on the input, or is it somewhere in between? After isolating decisions that do depend on the input, we explore interpreting these decisions using several different attribution methods. We compare these techniques based on their ability to select content and reconstruct the model's predicted token from perturbations of the input, thus revealing whether highlighted attributions are truly important for the generation of the next token. While this machinery can be broadly useful even beyond summarization, we specifically demonstrate its capability to identify phrases the summarization model has memorized and determine where in the training pipeline this memorization happened, as well as study complex generation phenomena like sentence fusion on a per-instance basis.
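To make the first step of the abstract concrete, here is a minimal sketch (not the authors' implementation) of how one might categorize a single decoder decision by ablating the input: compare the model's next-token distribution when conditioned on the full source document against the distribution obtained with the source ablated, and label the step as language-model-like when the two agree. The choice of model checkpoint, the empty-source ablation, and the KL-divergence threshold are all illustrative assumptions.

```python
# Sketch of ablation-based categorization of decoder decisions.
# Assumptions: a BART summarization checkpoint, ablation via an empty
# source document, and an arbitrary KL threshold for the mode label.
import torch
import torch.nn.functional as F
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn").eval()

def next_token_dist(source_text, summary_prefix):
    """Distribution over the next summary token given a source and a decoder prefix."""
    enc = tokenizer(source_text, return_tensors="pt", truncation=True)
    dec_ids = tokenizer(summary_prefix, return_tensors="pt",
                        add_special_tokens=False).input_ids
    start = torch.tensor([[model.config.decoder_start_token_id]])
    dec_ids = torch.cat([start, dec_ids], dim=1)
    with torch.no_grad():
        out = model(input_ids=enc.input_ids,
                    attention_mask=enc.attention_mask,
                    decoder_input_ids=dec_ids)
    return F.softmax(out.logits[0, -1], dim=-1)

def generation_mode(source_text, summary_prefix, threshold=0.1):
    """Crude mode label: a small divergence between the full and input-ablated
    distributions suggests the decoder is acting like a language model here."""
    p_full = next_token_dist(source_text, summary_prefix)
    p_ablated = next_token_dist("", summary_prefix)  # ablate the source document
    kl = F.kl_div(p_ablated.log(), p_full, reduction="sum").item()
    return ("LM-like" if kl < threshold else "input-dependent"), kl
```

In this sketch, steps flagged as input-dependent are the ones that would then be passed to the second stage, where attribution methods are used to identify which parts of the source drove the prediction.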