Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others have compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have prioritized either precision or recall, and hardly reconcile these complementary goals. This paper presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with greater expressiveness and better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important for expressing the temporal and spatial validity of assertions, as well as further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. An intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent.
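To make the contrast with plain SPO triples concrete, the following is a minimal sketch of how an assertion with a composite subject and semantic facets might be represented. The class names, field names, facet keys, and example values are all assumptions chosen for illustration; they are not taken from the Ascent KB schema or its released data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Concept:
    """Hypothetical subject: a plain concept, optionally narrowed to a
    subgroup (e.g. "African elephant") or an aspect (e.g. "trunk")."""
    name: str
    subgroup: Optional[str] = None
    aspect: Optional[str] = None

    def label(self) -> str:
        # Prefer the subgroup name over the general concept name.
        base = self.subgroup or self.name
        return f"{base}'s {self.aspect}" if self.aspect else base


@dataclass
class Assertion:
    """Hypothetical assertion: an SPO core plus optional facets that
    qualify it (facet keys such as "location" are illustrative)."""
    subject: Concept
    predicate: str
    obj: str
    facets: Dict[str, str] = field(default_factory=dict)

    def as_triple(self) -> Tuple[str, str, str]:
        # Flatten to a plain triple, discarding the facets.
        return (self.subject.label(), self.predicate, self.obj)

    def render(self) -> str:
        core = " ".join(self.as_triple())
        quals = ", ".join(f"{k}: {v}" for k, v in sorted(self.facets.items()))
        return f"{core} [{quals}]" if quals else core


if __name__ == "__main__":
    # Made-up example values, not entries from any real KB.
    a = Assertion(
        Concept("elephant", subgroup="African elephant"),
        "lives in",
        "herds",
        {"location": "savanna", "temporal": "year-round"},
    )
    print(a.as_triple())
    print(a.render())
```

In this sketch, `as_triple` shows what a triple-only representation would retain, while `render` keeps the facets that qualify when and where the assertion holds.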