Commonsense knowledge about object properties, human behavior and general concepts is crucial for robust AI applications. However, automatic acquisition of this knowledge is challenging because of sparseness and bias in online sources. This paper presents Quasimodo, a methodology and tool suite for distilling commonsense properties from non-standard web sources. We devise novel ways of tapping into search-engine query logs and QA forums, and combining the resulting candidate assertions with statistical cues from encyclopedias, books and image tags in a corroboration step. Unlike prior work on commonsense knowledge bases, Quasimodo focuses on salient properties that are typically associated with certain objects or concepts. Extensive evaluations, including extrinsic use-case studies, show that Quasimodo provides better coverage than state-of-the-art baselines with comparable quality.