Understanding inferences and answering questions from text requires more than merely recovering surface arguments, adjuncts, or strings associated with the query terms. As humans, we interpret sentences as contextualized components of a narrative or discourse, both by filling in missing information and by reasoning about event consequences. In this paper, we define Dense Paraphrasing (DP) as the process of rewriting a textual expression (lexeme or phrase) so that it reduces ambiguity while also making explicit the underlying semantics that is not (necessarily) expressed in the economy of sentence structure. We build the first complete DP dataset, describe the scope and design of the annotation task, and present results demonstrating how the DP process can enrich a source text to improve inferencing and QA task performance. The data and the source code will be publicly available.