Taxonomy expansion is the process of incorporating a large number of additional nodes (i.e., "queries") into an existing taxonomy (i.e., "seed"), with the most important step being the selection of an appropriate position for each query. Considerable effort has gone into exploiting the seed's structure. However, existing approaches are deficient in their mining of structural information in two ways: poor modeling of the hierarchical semantics and failure to capture the directionality of the is-a relation. This paper seeks to address these issues by explicitly representing each node as the combination of an inherited feature (i.e., structural part) and an incremental feature (i.e., supplementary part). Specifically, the inherited feature originates from "parent" nodes and is weighted by an inheritance factor. With this node representation, the hierarchy of semantics in taxonomies (i.e., the inheritance and accumulation of features from "parent" to "child") can be embodied. Additionally, based on this representation, the directionality of the is-a relation can be easily translated into the irreversible inheritance of features. Inspired by the Darmois-Skitovich Theorem, we implement this irreversibility by a non-Gaussian constraint on the supplementary feature. A log-likelihood learning objective is further utilized to optimize the proposed model (dubbed DNG), which also theoretically ensures the required non-Gaussianity. Extensive experimental results on two real-world datasets verify the superiority of DNG relative to several strong baselines.
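The role of the non-Gaussian constraint can be illustrated with a small numerical sketch. This is not the DNG model itself: it is a one-dimensional toy, assuming a child feature of the form `child = w * parent + incremental`, where the inheritance factor `w`, the sample size, the Laplace distribution, and the helper names `residual_dependence` and `simulate` are all hypothetical choices made for illustration. The sketch regresses in both directions and measures how strongly the residual still depends on the regressor. With non-Gaussian features, only the true parent-to-child direction should leave an independent residual; with Gaussian features, both directions look alike, which is the situation the Darmois-Skitovich Theorem rules out for non-Gaussian variables.

```python
import numpy as np


def residual_dependence(source, target):
    """Regress target on source (no intercept, zero-mean data assumed) and
    return |corr(source^2, residual^2)| as a crude dependence score."""
    slope = np.dot(source, target) / np.dot(source, source)
    residual = target - slope * source
    return abs(float(np.corrcoef(source**2, residual**2)[0, 1]))


def simulate(sampler, w=0.8, n=100_000):
    """Draw a toy parent/child pair and score both regression directions.

    w is a hypothetical inheritance factor; sampler(n) returns n zero-mean draws.
    """
    parent = sampler(n)
    incremental = sampler(n)            # supplementary part of the child
    child = w * parent + incremental    # inherited part + incremental part
    forward = residual_dependence(parent, child)   # parent -> child
    backward = residual_dependence(child, parent)  # child -> parent
    return forward, backward


rng = np.random.default_rng(0)
laplace_fwd, laplace_bwd = simulate(lambda n: rng.laplace(size=n))
gauss_fwd, gauss_bwd = simulate(lambda n: rng.normal(size=n))

print(f"non-Gaussian: forward={laplace_fwd:.3f}, backward={laplace_bwd:.3f}")
print(f"Gaussian:     forward={gauss_fwd:.3f}, backward={gauss_bwd:.3f}")
```

In the non-Gaussian run the forward score should be close to zero while the backward score is clearly positive, so the direction of inheritance is identifiable from the data. In the Gaussian run both scores should be close to zero, so the two directions cannot be told apart by this test.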