Taxonomy expansion is the process of incorporating a large number of additional nodes (i.e., "queries") into an existing taxonomy (i.e., the "seed"), where the most important step is selecting an appropriate position for each query. Considerable effort has gone into exploring the seed's structure. However, existing approaches mine structural information inadequately in two ways: they model hierarchical semantics poorly, and they fail to capture the directionality of the is-a relation. This paper addresses these issues by explicitly representing each node as the combination of an inherited feature (i.e., the structural part) and an incremental feature (i.e., the supplementary part). Specifically, the inherited feature originates from "parent" nodes and is weighted by an inheritance factor. This node representation embodies the hierarchy of semantics in taxonomies (i.e., the inheritance and accumulation of features from "parent" to "child"). Moreover, under this representation, the directionality of the is-a relation translates naturally into the irreversible inheritance of features. Inspired by the Darmois-Skitovich Theorem, we implement this irreversibility through a non-Gaussian constraint on the supplementary feature. A log-likelihood learning objective is further used to optimize the proposed model (dubbed DNG), and this objective also theoretically ensures the required non-Gaussianity. Extensive experimental results on two real-world datasets verify the superiority of DNG over several strong baselines.
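As a rough illustration of the node representation described in the abstract, the sketch below composes a "child" feature from a weighted "parent" feature plus an incremental part. The abstract does not give the model's exact form, so the scalar inheritance factor, the Laplace-distributed incremental feature, and the kurtosis check are all assumptions made for illustration; the function names are hypothetical and this is not the DNG implementation.

```python
import numpy as np


def compose_child(parent, inheritance_factor, incremental):
    """Child feature = weighted inherited part + incremental part.

    Assumption: the inheritance factor is a single scalar weight.
    """
    return inheritance_factor * parent + incremental


def excess_kurtosis(x):
    """Sample excess kurtosis; close to 0 for Gaussian samples."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0


rng = np.random.default_rng(0)
dim = 10_000

# Hypothetical parent feature (Gaussian here only for convenience).
parent = rng.normal(size=dim)

# Assumption: a Laplace draw stands in for the non-Gaussian incremental
# feature; the Laplace distribution has a theoretical excess kurtosis of 3.
incremental = rng.laplace(size=dim)

child = compose_child(parent, inheritance_factor=0.8, incremental=incremental)

print("excess kurtosis of incremental part:", excess_kurtosis(incremental))
```

The kurtosis check is only one simple way to see that the incremental part departs from a Gaussian; the abstract states that DNG instead secures non-Gaussianity through its log-likelihood objective.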