Abstractive summarization, the task of generating a concise summary of input documents, requires: (1) reasoning over the source document to determine the salient pieces of information scattered across it, and (2) composing a cohesive text by reconstructing these salient facts into a shorter summary that faithfully reflects the complex relations connecting them. In this paper, we adapt TP-TRANSFORMER (Schlag et al., 2019), an architecture that enriches the original Transformer (Vaswani et al., 2017) with the explicitly compositional Tensor Product Representation (TPR), for the task of abstractive summarization. The key feature of our model is a structural bias, introduced by encoding two separate representations for each token: the syntactic structure (with role vectors) and the semantic content (with filler vectors). The model then binds the role and filler vectors into the TPR as the layer output. We argue that these structured intermediate representations give the model better control over the content (salient facts) and the structure (the syntax that connects the facts) when generating the summary. Empirically, we show that our TP-TRANSFORMER significantly outperforms both the Transformer and the original TP-TRANSFORMER on several abstractive summarization datasets, under both automatic and human evaluations. On several syntactic and semantic probing tasks, we demonstrate the emergent structural information in the role vectors and the improved syntactic interpretability of the TPR layer outputs. Code and models are available at https://github.com/jiangycTarheel/TPT-Summ.
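To make the role-filler binding concrete, below is a minimal sketch in PyTorch, assuming the elementwise (Hadamard-product) binding used by the TP-TRANSFORMER of Schlag et al. (2019); the module name, projections, and tensor shapes are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TPRBinding(nn.Module):
    """Illustrative role-filler binding: each token gets a filler
    (semantic content) vector and a role (syntactic structure) vector,
    which are bound elementwise into a Tensor Product Representation,
    following the Hadamard binding of Schlag et al. (2019).
    Hypothetical sketch; shapes and projections are assumptions."""

    def __init__(self, d_model: int):
        super().__init__()
        self.filler_proj = nn.Linear(d_model, d_model)  # semantic content
        self.role_proj = nn.Linear(d_model, d_model)    # syntactic structure

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model)
        filler = self.filler_proj(hidden)
        # normalize roles so they act as soft structural slots
        role = F.normalize(self.role_proj(hidden), dim=-1)
        return filler * role  # elementwise (Hadamard) binding

# Usage sketch: bind roles and fillers for a batch of token states.
x = torch.randn(2, 16, 512)   # (batch, tokens, d_model)
tpr = TPRBinding(512)(x)      # bound role-filler layer output
```

The classical TPR binds a filler to a role with an outer product; the Hadamard product shown here is the cheaper diagonal special case adopted by the TP-TRANSFORMER, keeping the layer output at the model dimension.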