Abstractive neural summarization models have seen great improvements in recent years, as shown by ROUGE scores of the generated summaries. But despite these improved metrics, there is limited understanding of the strategies different models employ, and how those strategies relate to their understanding of language. To understand this better, we run several experiments to characterize how one popular abstractive model, the pointer-generator model of See et al. (2017), uses its explicit copy/generation switch to control its level of abstraction (generation) versus extraction (copying). On an extractive-biased dataset, the model utilizes syntactic boundaries to truncate sentences that are otherwise often copied verbatim. When we modify the copy/generation switch and force the model to generate, only simple paraphrasing abilities are revealed, alongside factual inaccuracies and hallucinations. On an abstractive-biased dataset, the model copies infrequently but shows similarly limited abstractive abilities. In line with previous research, these results suggest that abstractive summarization models lack the semantic understanding necessary to generate paraphrases that are both abstractive and faithful to the source document.
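To make the copy/generation switch concrete, the sketch below shows how a pointer-generator decoding step mixes a vocabulary (generation) distribution and an attention-based copy distribution, following the mixture described in See et al. (2017), and how forcing generation can be approximated by overriding the switch. This is a minimal illustration, not the authors' implementation: the function name, tensor shapes, and the `force_generate` flag are assumptions introduced here for clarity.

```python
import torch

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids, force_generate=False):
    """Mix generation and copy distributions for one decoder step.

    p_gen:      (batch,) copy/generation switch in [0, 1]
    vocab_dist: (batch, vocab_size) softmax over the output vocabulary
    attn_dist:  (batch, src_len) attention weights over source tokens
    src_ids:    (batch, src_len) vocabulary ids of the source tokens
    force_generate: if True, override the switch (p_gen = 1) so the model
                    must generate from the vocabulary instead of copying
    """
    if force_generate:
        p_gen = torch.ones_like(p_gen)

    # Scale the two distributions by the switch value.
    gen_part = p_gen.unsqueeze(1) * vocab_dist          # (batch, vocab_size)
    copy_part = (1.0 - p_gen).unsqueeze(1) * attn_dist  # (batch, src_len)

    # Add the copy probability mass onto the vocabulary positions
    # of the corresponding source tokens.
    return gen_part.scatter_add(1, src_ids, copy_part)

# Toy usage: the mixed output is still a valid probability distribution.
batch, vocab_size, src_len = 2, 10, 4
p_gen = torch.rand(batch)
vocab_dist = torch.softmax(torch.randn(batch, vocab_size), dim=1)
attn_dist = torch.softmax(torch.randn(batch, src_len), dim=1)
src_ids = torch.randint(0, vocab_size, (batch, src_len))
probs = final_distribution(p_gen, vocab_dist, attn_dist, src_ids, force_generate=True)
assert torch.allclose(probs.sum(dim=1), torch.ones(batch))
```

With `force_generate=True` the copy term vanishes, so the model must realize each output token through the vocabulary distribution alone, which is one plausible reading of the "force the model to generate" intervention described above.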