Graph embedding maps a graph into a convenient vector-space representation for graph analysis and machine learning applications. Many graph embedding methods hinge on sampling context nodes with random walks. However, random walks can be a biased sampler due to the structural properties of graphs. Most notably, random walks are biased by node degree: a node is sampled with probability proportional to its degree. The implications of such biases have not been clear, particularly in the context of graph representation learning. Here, we investigate the impact of the random walks' bias on graph embedding and propose residual2vec, a general graph embedding method that can correct for various structural biases in graphs by using random graphs. We demonstrate that this debiasing not only improves link prediction and clustering performance but also allows us to explicitly model salient structural properties in graph embedding.
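The degree bias mentioned in the abstract can be checked numerically on a toy example. The sketch below is only an illustration, not part of residual2vec: it assumes a simple unbiased random walk on a small undirected, connected, non-bipartite graph, and the edge list and the names `build_adjacency` and `stationary_distribution` are hypothetical. It iterates the walk's transition rule until the visit probabilities settle and compares them with degree divided by twice the number of edges.

```python
# Toy illustration (hypothetical graph, not from the paper): the long-run
# visit probability of a simple random walk is proportional to node degree.

def build_adjacency(edges):
    """Build an undirected adjacency list {node: [neighbors]} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

def stationary_distribution(adj, n_steps=2000):
    """Approximate the walk's stationary distribution by power iteration."""
    nodes = sorted(adj)
    p = {u: 1.0 / len(nodes) for u in nodes}  # start from a uniform distribution
    for _ in range(n_steps):
        nxt = {u: 0.0 for u in nodes}
        for u in nodes:
            share = p[u] / len(adj[u])  # the walker picks a neighbor uniformly
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return [p[u] for u in nodes]

# Assumed toy graph: a triangle (0, 1, 2) with a two-edge tail (0-3-4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
adj = build_adjacency(edges)

visit_prob = stationary_distribution(adj)
expected = [len(adj[u]) / (2 * len(edges)) for u in sorted(adj)]  # degree / 2m

for u, (got, want) in enumerate(zip(visit_prob, expected)):
    print(f"node {u}: visit probability {got:.4f}, degree/2m {want:.4f}")
```

On this graph the highest-degree node (node 0, degree 3) is visited three times as often as the leaf (node 4, degree 1), which is the kind of sampling bias the abstract refers to.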