Knowledge bases (KBs) have gradually become a valuable asset for many AI applications. While many current KBs are quite large, they are widely acknowledged to be incomplete, and they especially lack facts about long-tail entities, e.g., less famous persons. Existing approaches enrich KBs mainly by completing missing links or filling in missing values. However, they tackle only part of the enrichment problem and give no specific consideration to long-tail entities. In this paper, we propose a full-fledged approach to knowledge enrichment, which predicts missing properties and infers true facts of long-tail entities from the open Web. Prior knowledge from popular entities is leveraged to improve every enrichment step. Our experiments on synthetic and real-world datasets, together with comparisons against related work, demonstrate the feasibility and superiority of the approach.