Knowledge graphs (KGs) model facts about the world: they consist of nodes (entities such as companies and people) connected by edges (relations such as founderOf). Facts encoded in KGs are frequently used by search applications to augment result pages. When presenting a KG fact to the user, providing other facts that are pertinent to that main fact can enrich the user experience and support exploratory information needs. KG fact contextualization is the task of augmenting a given KG fact with additional, useful KG facts. The task is challenging because of the large size of KGs: even a small neighborhood of the given fact contains an enormous number of candidate facts. We introduce a neural fact contextualization method (NFCM) to address the KG fact contextualization task. NFCM first generates a set of candidate facts in the neighborhood of a given fact and then ranks the candidate facts using a supervised learning to rank model. The ranking model combines features that we automatically learn from data and that represent the query-candidate facts with a set of hand-crafted features we devised or adjusted for this task. In order to obtain the annotations required to train the learning to rank model at scale, we generate training data automatically using distant supervision on a large entity-tagged text corpus. We show that ranking functions learned on this data are effective at contextualizing KG facts. Evaluation using human assessors shows that NFCM significantly outperforms several competitive baselines.
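The two-stage pipeline described above (candidate generation in the neighborhood of the query fact, followed by ranking) can be sketched as follows. The toy KG, the feature function, and the scoring weights below are illustrative assumptions, not the paper's actual model, which combines learned neural representations with hand-crafted features.

```python
from typing import List, Tuple

Fact = Tuple[str, str, str]  # (subject, relation, object)

# A toy knowledge graph as a list of facts (illustrative only).
KG: List[Fact] = [
    ("Tesla", "founderOf", "ElonMusk"),
    ("ElonMusk", "ceoOf", "SpaceX"),
    ("Tesla", "headquarteredIn", "Austin"),
    ("Austin", "locatedIn", "Texas"),
]

def generate_candidates(query: Fact, kg: List[Fact]) -> List[Fact]:
    """Stage 1: collect facts that share at least one entity with the query fact."""
    s, _, o = query
    return [f for f in kg
            if f != query and (s in (f[0], f[2]) or o in (f[0], f[2]))]

def score(query: Fact, cand: Fact) -> float:
    """Stage 2 (stand-in scorer): a hand-crafted feature -- the number of
    entities shared with the query -- plus a small bonus when the candidate
    involves the query's subject. The real NFCM ranker instead combines
    features learned from data with hand-crafted ones."""
    shared = len({query[0], query[2]} & {cand[0], cand[2]})
    subject_bonus = 0.5 if query[0] in (cand[0], cand[2]) else 0.0
    return shared + subject_bonus

def contextualize(query: Fact, kg: List[Fact]) -> List[Fact]:
    """Rank candidate facts by relevance to the query fact."""
    cands = generate_candidates(query, kg)
    return sorted(cands, key=lambda f: score(query, f), reverse=True)

ranked = contextualize(("Tesla", "founderOf", "ElonMusk"), KG)
```

With this toy scorer, facts mentioning the query's subject (Tesla) rank above facts that only share the object, and facts sharing no entity with the query are never generated as candidates.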