Keyphrases, which concisely summarize the high-level topics discussed in a document, can be categorized into present keyphrases, which appear explicitly in the source text, and absent keyphrases, which do not match any contiguous subsequence of the source but are highly semantically related to it. Most existing keyphrase generation approaches generate present and absent keyphrases simultaneously, without explicitly distinguishing the two categories. In this paper, we propose a Select-Guide-Generate (SGG) approach that handles present and absent keyphrase generation separately, with different mechanisms. Specifically, SGG is a hierarchical neural network that consists of a pointing-based selector at the lower layer, which concentrates on present keyphrase generation; a selection-guided generator at the higher layer, which is dedicated to absent keyphrase generation; and a guider in the middle, which transfers information from the selector to the generator. Experimental results on four keyphrase generation benchmarks demonstrate the effectiveness of our model, which significantly outperforms strong baselines on both present and absent keyphrase generation. Furthermore, we extend SGG to a title generation task, which indicates its extensibility to other natural language generation tasks.
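The present/absent distinction defined above can be illustrated with a minimal sketch. It is not the SGG model or the authors' preprocessing; it only checks whether a keyphrase matches a contiguous token subsequence of the source. The whitespace tokenizer and the function names `tokenize`, `is_present`, and `split_keyphrases` are assumptions made for illustration, and normalization steps such as stemming are omitted.

```python
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def is_present(keyphrase: str, source: str) -> bool:
    """Return True if the keyphrase tokens form a contiguous subsequence of the source tokens."""
    kp = tokenize(keyphrase)
    src = tokenize(source)
    if not kp:
        return False
    n = len(kp)
    # Slide a window of length n over the source and compare token lists.
    return any(src[i:i + n] == kp for i in range(len(src) - n + 1))


def split_keyphrases(keyphrases: List[str], source: str) -> Tuple[List[str], List[str]]:
    """Partition keyphrases into (present, absent) with respect to the source text."""
    present = [k for k in keyphrases if is_present(k, source)]
    absent = [k for k in keyphrases if not is_present(k, source)]
    return present, absent


if __name__ == "__main__":
    source = "a pointing based selector copies words from the source text"
    keyphrases = ["source text", "keyphrase generation", "text source"]
    present, absent = split_keyphrases(keyphrases, source)
    print("present:", present)
    print("absent:", absent)
```

In this example, "text source" counts as absent even though both words occur in the source, because the match must be contiguous and in order.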