Sequence-to-sequence models provide a viable new approach to abstractive summarization, freeing models from merely selecting and recombining sentences from the source text. However, these models have three drawbacks: they often reproduce details of the source text inaccurately, the text they generate tends to repeat itself, and they struggle to handle out-of-vocabulary words. In this paper, we propose a new architecture that combines reinforcement learning with generative adversarial networks to enhance the sequence-to-sequence attention model. First, we use a hybrid pointer-generator network that can copy words directly from the source text, contributing to accurate reproduction of information without sacrificing the generator's ability to produce novel words. Second, we use both intra-temporal and intra-decoder attention to penalize already-summarized content and thus discourage repetition. We apply our model to our newly proposed COVID-19 paper-title summarization task and achieve results close to the current state-of-the-art model on ROUGE, while offering better readability.
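The pointer-generator mixture described above can be sketched as follows. This is an illustrative NumPy sketch, not the paper's implementation: the function name, the use of an extended vocabulary for source-side OOV tokens, and the scalar `p_gen` gate are assumptions following the standard pointer-generator formulation (See et al.).

```python
import numpy as np

def pointer_generator_dist(p_vocab, attention, src_ids, p_gen, vocab_size):
    """Mix the generator's vocabulary distribution with a copy
    distribution built from attention over the source tokens.

    p_vocab    : (vocab_size,) softmax over the fixed vocabulary
    attention  : (src_len,) attention weights over source positions
    src_ids    : (src_len,) token ids of the source; OOV tokens may be
                 assigned ids beyond vocab_size (extended vocabulary)
    p_gen      : scalar in [0, 1], probability of generating vs. copying
    vocab_size : size of the fixed output vocabulary
    """
    # Extend the output space so copied OOV tokens get probability mass.
    extended_size = max(vocab_size, int(src_ids.max()) + 1)
    final = np.zeros(extended_size)
    # Generation path: scale the vocabulary distribution by p_gen.
    final[:vocab_size] = p_gen * p_vocab
    # Copy path: scatter-add attention mass onto each source token's id
    # (np.add.at accumulates correctly when an id repeats in the source).
    np.add.at(final, src_ids, (1.0 - p_gen) * attention)
    return final
```

Because both input distributions sum to one, the mixture is again a valid probability distribution over the extended vocabulary, and a source-only OOV word (id ≥ `vocab_size`) can still receive nonzero probability through the copy path.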
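The intra-temporal attention mentioned above can be sketched as follows. This is a hedged sketch of the standard formulation (Paulus et al.), not necessarily the exact variant used here: raw attention scores at the current decoding step are divided by the sum of their exponentiated values at all previous steps, so source positions that were attended heavily before are penalized.

```python
import numpy as np

def intra_temporal_attention(score_history):
    """score_history: (t, src_len) raw attention scores e_{t'i} for
    decoding steps 1..t. Returns normalized attention weights for the
    current step t, penalizing positions attended at earlier steps."""
    exp_scores = np.exp(score_history)
    current = exp_scores[-1]
    if score_history.shape[0] == 1:
        # First decoding step: no history, plain softmax numerator.
        temporal = current
    else:
        # Divide by accumulated past attention to discourage revisiting
        # source positions that previous steps already focused on.
        temporal = current / exp_scores[:-1].sum(axis=0)
    return temporal / temporal.sum()
```

For example, a position that received a high score at an earlier step has its current weight divided by a large accumulated denominator, flattening repeated attention and hence discouraging repeated phrases in the output.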