Knowledge bases (KBs) about notable entities and their properties are an important asset in applications such as search, question answering, and dialogue. All popular KBs capture almost exclusively positive statements and take no stance on statements that are not stored in the KB. This paper makes the case for explicitly stating salient statements that do not hold. Negative statements help overcome the limitations of question answering systems that are mainly geared toward positive questions; they can also contribute to informative summaries of entities. Because such invalid statements are abundant, any effort to compile them must rank them by saliency. We present a statistical inference method for compiling and ranking negative statements, based on expectations from positive statements of related entities in peer groups. Experimental results on a variety of datasets show that the method can effectively discover notable negative statements, and extrinsic studies underline their usefulness for entity summarization. Datasets and code are released as resources for further research.
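To make the peer-group idea concrete, the following is a minimal sketch under stated assumptions, not the method evaluated in the paper. It treats a KB as a mapping from entity names to sets of (property, value) statements, collects the statements that peers hold but the target entity lacks, and scores each candidate by the fraction of peers holding it. The function name `rank_negative_candidates`, the toy entities, and the peer-frequency score are all hypothetical choices made for illustration. The output is a list of candidates only, since absence from a KB does not by itself establish that a statement is false.

```python
from collections import Counter


def rank_negative_candidates(entity, peers, kb):
    """Rank statements that peers hold but `entity` does not.

    `kb` maps an entity name to a set of (property, value) statements.
    The score is the fraction of peers holding the statement, an assumed
    stand-in for a saliency measure.
    """
    own = kb.get(entity, set())
    counts = Counter()
    for peer in peers:
        # Each statement is counted once per peer, since statements are a set.
        for stmt in kb.get(peer, set()):
            counts[stmt] += 1
    candidates = [
        (stmt, n / len(peers))
        for stmt, n in counts.items()
        if stmt not in own
    ]
    # Highest peer frequency first; ties are broken by the statement itself.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates


# Toy KB with hypothetical entities and statements.
kb = {
    "target": {("occupation", "physicist")},
    "peer1": {("occupation", "physicist"), ("award", "prize_x")},
    "peer2": {
        ("occupation", "physicist"),
        ("award", "prize_x"),
        ("member_of", "society_y"),
    },
    "peer3": {("occupation", "physicist"), ("award", "prize_x")},
}

ranked = rank_negative_candidates("target", ["peer1", "peer2", "peer3"], kb)
for stmt, score in ranked:
    print(stmt, round(score, 2))
```

In this toy KB, the statement shared by all three peers ranks above the one held by a single peer, which mirrors the intuition that a negation is more notable when the positive statement is expected within the peer group.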