Learning high-quality neural graph embeddings has long been achieved by factorizing the pointwise mutual information (PMI) of nodes that co-occur in simulated random walks. This design choice was popularized largely by the direct application of the highly successful word-embedding algorithm word2vec to predicting the formation of new links in social, co-citation, and biological networks. However, such a skeuomorphic design of graph embedding methods entails truncating the information carried by pairs of nodes with low PMI. To circumvent this issue, we propose an improved approach to learning low-rank factorization embeddings that incorporates information from such unlikely pairs of nodes, and we show that it improves the link prediction performance of baseline methods by 1.2% to 24.2%. Based on our results and observations, we outline further steps that could improve the design of future graph embedding algorithms based on matrix factorization.
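To make the truncation concrete, the following is a minimal illustrative sketch, not the method proposed here: it builds a PMI matrix from a toy co-occurrence matrix, applies the standard positive-PMI (PPMI) cutoff that discards low-PMI pairs, and contrasts it with a low-rank factorization of the unclipped PMI matrix that retains them. The toy co-occurrence matrix, the pmi_matrix and lowrank_embed helpers, and the embedding dimension are assumptions made purely for illustration.

import numpy as np

# Toy symmetric co-occurrence matrix (a hypothetical stand-in for counts
# accumulated from random walks on a real graph).
C = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

def pmi_matrix(cooc, eps=1e-12):
    # PMI(i, j) = log( P(i, j) / (P(i) * P(j)) ); eps avoids log(0).
    # Pairs that never co-occur get a strongly negative PMI here; real
    # pipelines handle such entries more carefully.
    total = cooc.sum()
    p_ij = cooc / total
    p_i = cooc.sum(axis=1, keepdims=True) / total
    p_j = cooc.sum(axis=0, keepdims=True) / total
    return np.log((p_ij + eps) / (p_i @ p_j + eps))

def lowrank_embed(M, dim):
    # Rank-`dim` factorization via SVD; rows of U * sqrt(S) act as node embeddings.
    U, S, _ = np.linalg.svd(M)
    return U[:, :dim] * np.sqrt(S[:dim])

pmi = pmi_matrix(C)
ppmi = np.maximum(pmi, 0.0)                 # baseline cutoff: low-PMI pairs are zeroed out

emb_baseline = lowrank_embed(ppmi, dim=2)   # embeddings that ignore low-PMI pairs
emb_full = lowrank_embed(pmi, dim=2)        # embeddings that retain low-PMI pairs

# Dot products between embeddings are the usual link-prediction scores.
print(np.round(emb_full @ emb_full.T, 2))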