Curated knowledge graphs encode domain expertise and improve the performance of recommendation, segmentation, ad targeting, and other machine learning systems in several domains. As new concepts emerge in a domain, knowledge graphs must be expanded to preserve machine learning performance. Manually expanding knowledge graphs, however, is infeasible at scale. In this work, we propose a method for knowledge graph expansion with humans-in-the-loop. Concretely, given a knowledge graph, our method predicts the "parents" of new concepts to be added to this graph for further verification by human experts. We show that our method is both accurate and provably "human-friendly". Specifically, we prove that our method predicts parents that are "near" concepts' true parents in the knowledge graph, even when the predictions are incorrect. We then show, with a controlled experiment, that satisfying this property increases both the speed and the accuracy of the human-algorithm collaboration. We further evaluate our method on a knowledge graph from Pinterest and show that it outperforms competing methods on both accuracy and human-friendliness. Upon deployment in production at Pinterest, our method reduced the time needed for knowledge graph expansion by ~400% (compared to manual expansion), and contributed to a subsequent increase in ad revenue of 20%.