Understanding what online users pay attention to is key to content recommendation and search services. These services would benefit from a highly structured, web-scale ontology of entities, concepts, events, topics, and categories. While existing knowledge bases and taxonomies contain a large volume of entities and categories, we argue that they fail to discover appropriately grained concepts, events, and topics expressed in the language style of the online population, nor do they maintain a logically structured ontology among these notions. In this paper, we present GIANT, a mechanism for constructing a user-centered, web-scale, structured ontology that contains a large number of natural language phrases conforming to user attention at various granularities, mined from a vast volume of web documents and search click graphs. Various types of edges are also constructed to maintain a hierarchy in the ontology. We present the graph-neural-network-based techniques used in GIANT and evaluate the proposed methods against a variety of baselines. GIANT has produced the Attention Ontology, which has been deployed in various Tencent applications involving over a billion users. Online A/B testing performed on Tencent QQ Browser shows that the Attention Ontology can significantly improve click-through rates in news recommendation.