Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works such as ConceptNet, TupleKB and others have compiled large CSK collections, but their expressiveness is restricted to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Moreover, these projects have prioritized either precision or recall, and hardly reconcile these complementary goals. This paper presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with greater expressiveness and both better precision and better recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important for expressing the temporal and spatial validity of assertions as well as further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent. A web interface, data and code can be found at https://ascent.mpi-inf.mpg.de/.
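To make the difference from plain SPO triples concrete, the following is a minimal sketch of how an assertion with a composite subject and semantic facets might be represented. The class names, field names, facet keys and the example values are assumptions made here for illustration only; they are not the actual Ascent schema or data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Concept:
    """Hypothetical subject type: a plain concept, a subgroup, or an aspect."""
    name: str                       # base concept, e.g. "elephant"
    subgroup: Optional[str] = None  # refinement of the concept, e.g. "baby elephant"
    aspect: Optional[str] = None    # part or facet of the concept, e.g. "trunk"

    def label(self) -> str:
        # A subgroup already names the refined concept; an aspect is
        # rendered together with its base concept.
        if self.subgroup is not None:
            return self.subgroup
        if self.aspect is not None:
            return f"{self.name} {self.aspect}"
        return self.name


@dataclass
class Assertion:
    """Hypothetical assertion: an SPO core plus optional semantic facets."""
    subject: Concept
    predicate: str
    obj: str
    # Facets qualify the assertion, e.g. temporal or spatial validity.
    facets: Dict[str, str] = field(default_factory=dict)

    def as_triple(self) -> Tuple[str, str, str]:
        # Projecting onto a plain triple drops the facets.
        return (self.subject.label(), self.predicate, self.obj)


# Illustrative, made-up example (not taken from the Ascent KB).
example = Assertion(
    subject=Concept("elephant", aspect="trunk"),
    predicate="is used for",
    obj="grasping food",
    facets={"location": "in the wild"},
)
```

Calling `example.as_triple()` projects the assertion back onto a plain triple, which shows what a triple-only representation would lose: the facet dictionary carrying the qualifiers.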