Keyphrase generation is the task of generating phrases (keyphrases) that summarize the main topics of a given document. The generated keyphrases can be either present in or absent from the text of the document. While the extraction of present keyphrases has received much attention in the past, only recently has a stronger focus been placed on the generation of absent keyphrases. However, generating absent keyphrases is very challenging; even the best methods show only a modest degree of success. In this paper, we propose an approach, called keyphrase dropout (or KPDrop), to improve absent keyphrase generation. During training, we randomly drop present keyphrases from the document and turn them into artificial absent keyphrases. We test our approach extensively and show that it consistently improves the absent keyphrase performance of strong baselines in keyphrase generation.
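The dropout step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `kp_drop`, the `drop_rate` value, the `<mask>` replacement token, and the case-insensitive whole-phrase matching are all assumptions made for illustration. The abstract only states that present keyphrases are randomly dropped from the document and treated as absent during training.

```python
import random
import re


def _phrase_pattern(phrase):
    # Whole-phrase, case-insensitive match; lookarounds avoid matching
    # inside a longer word (an assumed matching rule, not from the paper).
    return re.compile(r"(?<!\w)" + re.escape(phrase) + r"(?!\w)", re.IGNORECASE)


def kp_drop(document, keyphrases, drop_rate=0.7, mask_token="<mask>", rng=None):
    """Randomly drop present keyphrases from `document`.

    Returns (new_document, present, absent), where every dropped keyphrase
    is reported as absent. `drop_rate` and `mask_token` are hypothetical
    defaults; pick a mask token that cannot collide with a keyphrase.
    """
    rng = rng or random.Random()
    new_doc = document
    for phrase in keyphrases:
        pattern = _phrase_pattern(phrase)
        # Only phrases that currently occur in the text can be dropped.
        if pattern.search(new_doc) and rng.random() < drop_rate:
            new_doc = pattern.sub(mask_token, new_doc)
    # Re-classify against the modified text so that overlapping phrases
    # (one keyphrase contained in another) stay consistent.
    present = [p for p in keyphrases if _phrase_pattern(p).search(new_doc)]
    absent = [p for p in keyphrases if not _phrase_pattern(p).search(new_doc)]
    return new_doc, present, absent


if __name__ == "__main__":
    doc = "Keyphrase generation summarizes topics with neural models."
    kps = ["keyphrase generation", "neural models", "topic modeling"]
    print(kp_drop(doc, kps, drop_rate=0.5, rng=random.Random(0)))
```

Applying the function anew to each training example in each epoch would expose the model to different present/absent splits of the same document, which is one plausible way to realize the "artificial absent keyphrases" the abstract mentions.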