Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset

Reasoning over commonsense knowledge bases (CSKBs) whose elements are in the form of free text is an important yet hard task in NLP. While CSKB completion only fills in missing links within the domain of the CSKB, CSKB population has been proposed as an alternative, with the goal of reasoning about unseen assertions from external resources. In this task, CSKBs are grounded to a large-scale eventuality (activity, state, and event) graph in order to discriminate whether novel triples from the eventuality graph are plausible or not. However, existing evaluations of the population task are either inaccurate (automatic evaluation with randomly sampled negative examples) or small in scale (human annotation). In this paper, we benchmark the CSKB population task with a new large-scale dataset by first aligning four popular CSKBs, and then presenting a high-quality human-annotated evaluation set to probe neural models' commonsense reasoning ability. We also propose a novel inductive commonsense reasoning model that reasons over graphs. Experimental results show that generalizing commonsense reasoning to unseen assertions is inherently a hard task. Models that achieve high accuracy during training perform poorly on the evaluation set, leaving a large gap to human performance. Code and data are publicly available for future contributions at https://github.com/HKUST-KnowComp/CSKB-Population.

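Related content

Knowledge base

A knowledge base is a structured, comprehensive, and organized collection of knowledge in knowledge engineering that is easy to operate on and to use. It is a set of interrelated knowledge pieces that are stored, organized, managed, and used in computer memory through one or more knowledge representation schemes, to meet the needs of problem solving in one or more domains. These knowledge pieces include domain-related theoretical knowledge and factual data, heuristic knowledge derived from expert experience (such as the definitions, theorems, and rules of operation relevant to a domain), and commonsense knowledge.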

[USC 2021] Commonsense reasoning, 47-page slide deck: Commonsense Reasoning in the Wild

October 9, 2021

[Causality fundamentals] Causality Basics, 36-page slide deck

August 8, 2021

[Stanford] Building a knowledge graph from electronic health records (EHR): Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph

June 2, 2020

[IJCAI 2020] From linguistic graphs to commonsense graphs: TransOMCS: From Linguistic Graphs to Commonsense Knowledge

May 6, 2020