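A knowledge base (KB) is, in knowledge engineering, a structured, comprehensive, and organized collection of knowledge that is easy to operate on and easy to use. It is a set of interrelated knowledge pieces that are stored, organized, managed, and used in computer memory under one or more knowledge representation schemes, to meet the needs of problem solving in one or more domains. These knowledge pieces include domain-related theoretical knowledge and factual data, heuristic knowledge derived from expert experience, such as the definitions, theorems, and rules of operation of a domain, and common-sense knowledge.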

VIP content

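VLDB (International Conference on Very Large Data Bases) is a top academic conference in the database field; together with SIGMOD and ICDE, it is one of the three leading database conferences. This tutorial covers topics related to knowledge graphs.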

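General-purpose knowledge bases (KBs) are an important component of many data-driven applications. Constructed pragmatically from available web sources, these KBs are far from complete, which poses a set of challenges for both curation and consumption. In this tutorial, we discuss how completeness, recall, and negation in DBs and KBs can be represented, extracted, and inferred. (i) We first introduce the logical foundations of knowledge representation and querying under partial closed-world semantics. (ii) We show how information about recall can be identified in KBs and in text, and (iii) how it can be estimated via statistical patterns. (iv) We show how interesting negative statements can be identified, and (v) how recall can be targeted in a comparative notion.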

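Web-scale knowledge bases (KBs) such as Wikidata [32], DBpedia [2], or Yago [30] are used in applications ranging from question answering to personal assistants. Constructed from web sources, they focus on representing positive knowledge, i.e., true statements. They do not store negative statements. They are also incomplete, i.e., they do not contain all true statements of the domain of interest. This means that if a statement is not in the KB, we do not know whether it is false in the real world or merely missing.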

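This poses major challenges for the curation and application of KBs. First, KB curators may want to know where the KB is incomplete, so that they can prioritize their completion efforts. This holds in particular for KBs such as NELL [4], which aim to complete themselves automatically. Second, KB applications need to know where the data is incomplete, so that they can alert end users to quality issues. For example, if Tokyo happens to be missing from the KB, the query "largest city in Japan" may return a wrong answer. Similarly, a KB used for question answering in an enterprise setting needs to know when a question goes beyond its knowledge [22]. This applies especially to boolean questions such as "Did Airbus build this aircraft?", where an answer of "no" may simply stem from missing information. Finally, for requests to summarize salient information about an entity, a comprehensive answer should also include salient facts that do not hold.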
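The boolean-question problem can be made concrete with a small sketch. The code below is an illustration only, not the tutorial's formalism: it assumes a toy set of triples and a hypothetical set `complete_for` of (subject, predicate) pairs for which the KB is asserted to list all objects. Under this simplified partial closed-world reading, a missing triple yields "no" only where completeness is asserted, and "unknown" everywhere else.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def ask(kb: Set[Triple], complete_for: Set[Tuple[str, str]], triple: Triple) -> str:
    """Answer a boolean query as 'yes', 'no', or 'unknown'."""
    subject, predicate, _ = triple
    if triple in kb:
        return "yes"      # the statement is stored, so it is taken as true
    if (subject, predicate) in complete_for:
        return "no"       # the KB claims to know all objects for this pair
    return "unknown"      # absence alone does not imply falsity


# Toy data, made up for illustration.
kb = {
    ("A380", "manufacturer", "Airbus"),
    ("Japan", "hasCity", "Osaka"),
}
# Hypothetical completeness statement: all manufacturers of the A380 are listed.
complete_for = {("A380", "manufacturer")}

print(ask(kb, complete_for, ("A380", "manufacturer", "Airbus")))  # yes
print(ask(kb, complete_for, ("A380", "manufacturer", "Boeing")))  # no
print(ask(kb, complete_for, ("Japan", "hasCity", "Tokyo")))       # unknown
```

Returning a third value instead of a plain boolean is the design choice that matters here: it lets an application flag the Tokyo case as a possible gap rather than silently answering "no".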

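Traditionally, KB construction and curation have focused on provenance and accuracy [23,33]. In recent years, however, formalisms for describing recall and negative knowledge have matured [1,5,18], and statistical and text-based methods have emerged for estimating recall [3,7,12-14,17,24,29] and for deriving negative statements [1,13]. Systematizing these methods and making them accessible to a general database audience is the subject of this tutorial. The tutorial is of interest to both theoreticians and practitioners. It introduces the audience to the latest advances in completeness assessment and negation, and equips them with a set of methods to better represent and assess the recall of a given dataset.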
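To give a feel for what a statistical recall estimate can look like, here is a minimal sketch based on the classical capture-recapture (Lincoln-Petersen) estimator. This technique is chosen for illustration and is an assumption; the text above does not say which estimators the tutorial covers. The function name `estimate_recall` and the sample data are hypothetical, and the estimate is only meaningful if the second sample is drawn independently of the KB.

```python
def estimate_recall(kb_entities: set, independent_sample: set) -> float:
    """Estimate the recall of kb_entities with the Lincoln-Petersen estimator.

    The true population size is estimated as n1 * n2 / m, where n1 and n2 are
    the sizes of the two samples and m is the size of their overlap.
    Recall is then n1 divided by that estimated total.
    """
    overlap = len(kb_entities & independent_sample)
    if overlap == 0:
        raise ValueError("no overlap between samples; the estimate is undefined")
    estimated_total = len(kb_entities) * len(independent_sample) / overlap
    return len(kb_entities) / estimated_total


# Made-up example: the KB lists 4 entities of some class; a second,
# independently collected list has 5 entities, 2 of which the KB also has.
kb_entities = {"a", "b", "c", "d"}
independent_sample = {"c", "d", "e", "f", "g"}

# Estimated total is 4 * 5 / 2 = 10, so estimated recall is 4 / 10.
print(estimate_recall(kb_entities, independent_sample))  # 0.4
```

The estimate reduces to the fraction of the independent sample that the KB already contains, which is why the independence assumption carries most of the weight.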


Latest content

Reasoning over commonsense knowledge bases (CSKBs) whose elements are in the form of free text is an important yet hard task in NLP. While CSKB completion only fills in missing links within the domain of the CSKB, CSKB population has been proposed as an alternative, with the goal of reasoning about unseen assertions from external resources. In this task, CSKBs are grounded to a large-scale eventuality (activity, state, and event) graph to discriminate whether novel triples from the eventuality graph are plausible or not. However, existing evaluations of the population task are either inaccurate (automatic evaluation with randomly sampled negative examples) or small in scale (human annotation). In this paper, we benchmark the CSKB population task with a new large-scale dataset by first aligning four popular CSKBs, and then presenting a high-quality human-annotated evaluation set to probe neural models' commonsense reasoning ability. We also propose a novel inductive commonsense reasoning model that reasons over graphs. Experimental results show that generalizing commonsense reasoning to unseen assertions is inherently a hard task. Models that achieve high accuracy during training perform poorly on the evaluation set, leaving a large gap to human performance. Code and data are publicly available for future contributions at https://github.com/HKUST-KnowComp/CSKB-Population.

