Curated knowledge graphs encode domain expertise and improve the performance of recommendation, segmentation, ad targeting, and other machine learning systems in several domains. As new concepts emerge in a domain, knowledge graphs must be expanded to preserve machine learning performance. Manually expanding knowledge graphs, however, is infeasible at scale. In this work, we propose a method for knowledge graph expansion with humans-in-the-loop. Concretely, given a knowledge graph, our method predicts the "parents" of new concepts to be added to this graph for further verification by human experts. We show that our method is both accurate and provably "human-friendly". Specifically, we prove that our method predicts parents that are "near" concepts' true parents in the knowledge graph, even when the predictions are incorrect. We then show, with a controlled experiment, that satisfying this property increases both the speed and the accuracy of the human-algorithm collaboration. We further evaluate our method on a knowledge graph from Pinterest and show that it outperforms competing methods on both accuracy and human-friendliness. Upon deployment in production at Pinterest, our method reduced the time needed for knowledge graph expansion by ~400% (compared to manual expansion), and contributed to a subsequent increase in ad revenue of 20%.
翻译:缩略知识图表将域内的专门知识编码成,并改进若干领域的建议、 分割、 目标选择和其他机器学习系统的绩效。 当一个领域出现新概念时, 知识图表必须扩大, 以保存机器学习的绩效。 但是, 手工扩展知识图表在规模上是行不通的。 在这项工作中, 我们建议了一种方法, 利用“ 人类” 来扩展知识图表。 具体地说, 有了知识图表, 我们的方法预测了要添加到这个图表中的新概念的“ 父母”, 供人类专家进一步核实。 我们显示, 我们的方法既准确又可以想象地“ 人类友好” 。 具体而言, 我们证明我们的方法预测了“ 接近” 概念的父母在知识图中的真实父母, 即使预测是不正确的。 我们然后通过一个有控制的实验表明, 满足这个属性既能提高人类- 语言合作的速度和准确性。 我们进一步评估了我们从“ 兴趣” 上增加知识图表的方法, 并显示它在准确和“ 人类友好性” 。 具体地说, 我们的方法可以证明我们的方法在“ ” 中预测出在“ 人类友好性 。 在生产中“ 接近 ” 和“ 扩展 增长 方法 增加” 的“ 方法 上增加” 的“, 我们的“ 增加 “ ” 的“ 增加” 的“ 的“ 的“ ” ” 和“ 增长” 的“ ” ” 到” 的“ 的“ 的“ 和“ 方法” 到” 增长” 到” 的“ 的“ 的“ 数据” 到” 的“ 增加” 到” 的“ 的“ 的“ 增加” 增加” 。