Textual knowledge bases such as Wikipedia require considerable effort to keep up to date and consistent. While automated writing assistants could potentially ease this burden, the problem of suggesting edits grounded in external knowledge has been under-explored. In this paper, we introduce the novel generation task of *faithfully reflecting updated information in text* (FRUIT), where the goal is to update an existing article given new evidence. We release the FRUIT-WIKI dataset, a collection of over 170K distantly supervised instances produced from pairs of Wikipedia snapshots, along with our data generation pipeline and a gold evaluation set of 914 instances whose edits are guaranteed to be supported by the evidence. We provide benchmark results for popular generation systems as well as for EDIT5, a T5-based approach tailored to editing that we introduce and that establishes the state of the art. Our analysis shows that developing models that can update articles faithfully requires new capabilities for neural generation models, and opens doors to many new applications.