Graph representation learning has achieved great success in many areas, including e-commerce, chemistry, and biology. However, the fundamental problem of choosing an appropriate node embedding dimension for a given graph remains unsolved. The commonly used strategies for Node Embedding Dimension Selection (NEDS), which rely on grid search or empirical knowledge, suffer from heavy computation and poor model performance. In this paper, we revisit NEDS from the perspective of the minimum entropy principle and propose a novel Minimum Graph Entropy (MinGE) algorithm for NEDS on graph data. Specifically, MinGE considers both feature entropy and structure entropy on graphs, each designed according to the characteristics of the information it captures. The feature entropy, which assumes that the embeddings of adjacent nodes are more similar, connects node features and link topology. The structure entropy takes the normalized degree as its basic unit to further measure the higher-order structure of graphs. Based on these two entropies, we design MinGE to directly calculate the ideal node embedding dimension for any graph. Finally, comprehensive experiments with popular Graph Neural Networks (GNNs) on benchmark datasets demonstrate the effectiveness and generalizability of MinGE.
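
To make the notion of normalized degree as a basic unit concrete, the following minimal sketch computes the Shannon entropy of a graph's normalized degree distribution. It is an illustration under stated assumptions, not the MinGE structure entropy itself: the abstract does not give the exact formula, and this first-order quantity ignores the higher-order structure that MinGE is said to measure. The function name `degree_entropy` and the edge-list input format are hypothetical choices made for this example.

```python
import math


def degree_entropy(edges):
    """Shannon entropy (in bits) of the normalized degree distribution.

    `edges` is an iterable of (u, v) pairs describing an undirected graph.
    Assumption: this is a generic first-order degree entropy used for
    illustration, not the structure entropy defined by MinGE.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    total = sum(degree.values())  # equals twice the number of edges
    # Each node contributes p * log2(p), where p is its normalized degree.
    return -sum((d / total) * math.log2(d / total) for d in degree.values())


# A 4-cycle has uniform degrees; a star concentrates degree on its hub.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
star = [(0, 1), (0, 2), (0, 3)]
print(degree_entropy(cycle), degree_entropy(star))
```

On graphs with the same number of nodes, a uniform degree distribution gives the largest value of this quantity, while a hub-dominated graph such as the star gives a smaller one.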