Topic-controllable summarization is an emerging research area with a wide range of potential applications. However, existing approaches suffer from significant limitations. First, there is currently no established evaluation metric for this task. Furthermore, existing methods are built upon recurrent architectures, which can significantly limit their performance compared to more recent Transformer-based architectures, and they also require modifications to the model's architecture to control the topic. In this work, we propose a new topic-oriented evaluation measure that automatically evaluates generated summaries based on the topic affinity between the generated summary and the desired topic. We also conduct a user study that validates the reliability of this measure. Finally, we propose simple yet powerful methods for topic-controllable summarization, either incorporating topic embeddings into the model's architecture or employing control tokens to guide the summary generation. Experimental results show that control tokens can achieve better performance than the more complicated embedding-based approaches while also being significantly faster.
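To make the two ideas above concrete, the following is a minimal illustrative sketch, not the method described in this work. It assumes that a control token is simply a topic tag prepended to the source document, and that topic affinity can be approximated by cosine similarity between bag-of-words vectors of the summary and a list of topic keywords. The token format `<topic>`, the function names, and the bag-of-words representation are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter
from typing import List


def add_control_token(document: str, topic: str) -> str:
    """Prepend a hypothetical topic control token to the source document.

    The "<topic>" format is an assumption made for this sketch.
    """
    return f"<{topic.lower()}> {document}"


def bag_of_words(text: str) -> Counter:
    """Lowercased whitespace-token counts; a stand-in for a real text representation."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def topic_affinity(summary: str, topic_keywords: List[str]) -> float:
    """Toy affinity score: similarity between a summary and topic keywords.

    An assumed proxy for illustration, not the measure proposed in this work.
    """
    topic_vector = Counter(w.lower() for w in topic_keywords)
    return cosine(bag_of_words(summary), topic_vector)


if __name__ == "__main__":
    # The tagged input would be fed to a summarization model (not shown here).
    print(add_control_token("The team won the final match.", "Sports"))

    sports = ["team", "match"]
    print(topic_affinity("the team won the match", sports))
    print(topic_affinity("the stock market fell", sports))
```

Under these assumptions, steering the topic only changes the input text, so the model's architecture is left untouched, which is the property the control-token approach relies on.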