Automatic taxonomy construction has many applications in e-commerce and web search. One critical challenge is that, as data and business scope grow in real applications, new concepts emerge and need to be added to the existing taxonomy. Previous approaches focus on taxonomy expansion, i.e., finding an appropriate hypernym concept in the taxonomy for a new query concept. In this paper, we formulate a new task, "taxonomy completion", which requires discovering both the hypernym and the hyponym concepts for a query. We propose the Triplet Matching Network (TMN) to find the appropriate <hypernym, hyponym> pairs for a given query concept. TMN consists of one primal scorer and multiple auxiliary scorers. The auxiliary scorers capture various fine-grained signals (e.g., query-to-hypernym or query-to-hyponym semantics), and the primal scorer makes a holistic prediction on the <query, hypernym, hyponym> triplet based on the internal feature representations of all auxiliary scorers. In addition, we introduce a channel-wise gating mechanism that retains task-specific information in concept representations and further boosts model performance. Experiments on four real-world large-scale datasets show that TMN outperforms existing methods on both the taxonomy completion task and the earlier taxonomy expansion task.
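
To make the scorer layout concrete, the following is a minimal sketch of how one primal scorer could consume the hidden features of two auxiliary scorers after channel-wise gating. It is an illustration under stated assumptions, not the authors' implementation: the class and function names (`TripletMatcher`, `AuxiliaryScorer`, `channel_gate`, `rank_positions`), the single-layer tanh scorers, the sigmoid gate, the dimensions, and the random untrained weights are all hypothetical choices made for readability.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def channel_gate(W_gate, x):
    # Assumed form of channel-wise gating: each channel of x is scaled by a
    # gate in (0, 1) computed from the whole embedding.
    gates = [sigmoid(z) for z in matvec(W_gate, x)]
    return [g * v for g, v in zip(gates, x)]


class AuxiliaryScorer:
    """Scores one fine-grained relation, e.g. query-to-hypernym (hypothetical design)."""

    def __init__(self, dim, hidden, rng):
        self.W = rand_matrix(hidden, 2 * dim, rng)
        self.u = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]

    def forward(self, a, b):
        # Hidden features are returned so the primal scorer can reuse them.
        h = [math.tanh(z) for z in matvec(self.W, a + b)]
        score = sigmoid(sum(ui * hi for ui, hi in zip(self.u, h)))
        return h, score


class TripletMatcher:
    """One primal scorer on top of two auxiliary scorers (illustrative only)."""

    def __init__(self, dim=4, hidden=3, seed=0):
        rng = random.Random(seed)
        self.gate_q = rand_matrix(dim, dim, rng)
        self.gate_p = rand_matrix(dim, dim, rng)
        self.gate_c = rand_matrix(dim, dim, rng)
        self.aux_hyper = AuxiliaryScorer(dim, hidden, rng)
        self.aux_hypo = AuxiliaryScorer(dim, hidden, rng)
        self.v = [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)]

    def score(self, query, hypernym, hyponym):
        q = channel_gate(self.gate_q, query)
        p = channel_gate(self.gate_p, hypernym)
        c = channel_gate(self.gate_c, hyponym)
        h_hyper, s_hyper = self.aux_hyper.forward(q, p)
        h_hypo, s_hypo = self.aux_hypo.forward(q, c)
        # The primal score is computed from the auxiliary scorers' internal
        # features, not from their scalar outputs.
        features = h_hyper + h_hypo
        primal = sigmoid(sum(vi * fi for vi, fi in zip(self.v, features)))
        return primal, (s_hyper, s_hypo)


def rank_positions(model, query, candidate_pairs):
    # Order candidate <hypernym, hyponym> pairs by primal score, best first.
    scored = [(model.score(query, p, c)[0], i)
              for i, (p, c) in enumerate(candidate_pairs)]
    return [i for _, i in sorted(scored, reverse=True)]
```

In use, each candidate insertion position would be encoded as a pair of hypernym and hyponym embeddings and passed to `rank_positions` together with the query embedding. In a trained model the auxiliary scores would presumably also serve as extra supervision signals, which this untrained sketch does not show.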