Representing text as a graph to solve the summarization task has been discussed for more than ten years. However, despite the development of attention mechanisms and the Transformer, the connection between attention and graphs remains poorly understood. We demonstrate that text structure can be analyzed through the attention matrix, whose attention weights represent the relations between sentences. In this work, we show that the attention matrix produced by a pre-trained language model can be used as the adjacency matrix of a graph convolutional network. Our model achieves competitive results on two different datasets as measured by ROUGE. Moreover, with fewer parameters, the model reduces the computational resources needed for training and inference.
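The core idea above can be sketched as follows: treat sentences as graph nodes and feed an attention matrix into a graph convolutional layer as the adjacency matrix. This is a minimal NumPy illustration, not the paper's actual model; the attention weights and sentence features here are random placeholders (in practice they would come from a pre-trained language model), and the layer follows the standard GCN formulation with symmetric normalization.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph convolutional layer: ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt    # symmetric normalization
    return np.maximum(a_norm @ feats @ weight, 0.0)  # ReLU activation

rng = np.random.default_rng(0)
n_sent, d_in, d_out = 4, 8, 2

# Placeholder for attention weights between the n_sent sentences:
# row-stochastic, as attention distributions are.
attn = rng.random((n_sent, n_sent))
attn = attn / attn.sum(axis=1, keepdims=True)

feats = rng.random((n_sent, d_in))   # placeholder sentence embeddings
w = rng.random((d_in, d_out))        # layer weights

out = gcn_layer(attn, feats, w)
print(out.shape)  # one d_out-dimensional vector per sentence
```

Because the attention matrix is reused rather than learned as a separate graph, the only new parameters are the GCN layer weights, which is consistent with the parameter savings claimed above.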