Identifying the differences between two versions of the same article is useful for updating knowledge bases and for understanding how articles evolve. Paired texts occur naturally in diverse situations: reporters write similar news stories, and maintainers of authoritative websites must keep their information up to date. We propose representing factual changes between paired documents as question-answer pairs, where the answer to the same question differs between the two versions. We find that question-answer pairs can flexibly and concisely capture the updated content. Given paired documents, annotators identify questions that are answered by one passage but are answered differently, or cannot be answered, by the other. We release DIFFQG, which consists of 759 QA pairs and 1153 examples of paired passages with no factual change. These questions are intended to be both unambiguous and information-seeking, and they involve complex edits, pushing beyond the capabilities of current question generation and factual change detection systems. Our dataset summarizes the changes between two versions of a document as questions and answers, studying automatic update summarization in a novel way.
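To make the representation concrete, the following minimal sketch shows one way a factual change could be stored as a question with a differing answer per version. The class name, field names, and toy records are assumptions made for illustration; they are not the released DIFFQG schema or data. An answer of `None` stands for a passage that cannot answer the question, and a passage pair with no factual change corresponds to an empty list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FactualChange:
    """One question about a pair of passage versions.

    Field names are hypothetical, not the released DIFFQG schema.
    """
    question: str
    answer_before: Optional[str]  # None: the older passage cannot answer it
    answer_after: Optional[str]   # None: the newer passage cannot answer it

    def is_change(self) -> bool:
        # A record describes a factual change only when the two answers differ.
        return self.answer_before != self.answer_after


def summarize(changes: list[FactualChange]) -> list[str]:
    """Render each changed question as a one-line update summary."""
    lines = []
    for c in changes:
        if not c.is_change():
            continue  # identical answers carry no update
        before = c.answer_before if c.answer_before is not None else "(unanswerable)"
        after = c.answer_after if c.answer_after is not None else "(unanswerable)"
        lines.append(f"{c.question} {before} -> {after}")
    return lines


# Invented toy records, for illustration only.
example = [
    FactualChange("How many employees does the company have?", "120", "150"),
    FactualChange("Who is the current chief executive?", None, "A. Smith"),
    FactualChange("Where is the company headquartered?", "Springfield", "Springfield"),
]
summary = summarize(example)
```

Here `summary` holds two lines, one for the changed headcount and one for the newly answerable question; the third record is skipped because its answer is the same in both versions.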