Node embedding is a powerful approach for representing the structural role of each node in a graph. $\textit{Node2vec}$ is a widely used node embedding method that explores local neighborhoods via biased random walks on the graph. However, $\textit{node2vec}$ does not consider edge weights when computing walk biases. This intrinsic limitation prevents $\textit{node2vec}$ from leveraging all the information in weighted graphs and, in turn, limits its application to the many real-world networks that are weighted and dense. Here, we naturally extend $\textit{node2vec}$ to $\textit{node2vec+}$ in a way that accounts for edge weights when calculating walk biases, but that reduces to $\textit{node2vec}$ for unweighted graphs or unbiased walks. Using two synthetic datasets, we empirically show that $\textit{node2vec+}$ is more robust than $\textit{node2vec}$ to additive noise in weighted graphs. We also demonstrate that $\textit{node2vec+}$ significantly outperforms $\textit{node2vec}$ on a commonly benchmarked multi-label dataset (Wikipedia). Furthermore, we test $\textit{node2vec+}$ against GCN and GraphSAGE on various challenging gene classification tasks using two protein-protein interaction networks. Despite some clear advantages of GCN and GraphSAGE, their performance is comparable to that of $\textit{node2vec+}$. Finally, $\textit{node2vec+}$ can be used as a general approach for generating biased random walks, benefiting all existing methods built on top of $\textit{node2vec}$. $\textit{Node2vec+}$ is implemented as part of $\texttt{PecanPy}$, which is available at https://github.com/krishnanlab/PecanPy.
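To make the limitation concrete, the following is a minimal sketch of the second-order walk bias of the original $\textit{node2vec}$ (return parameter $p$, in-out parameter $q$). It is not the $\textit{node2vec+}$ bias and it is not the $\texttt{PecanPy}$ implementation; the function names, the dictionary-of-dictionaries graph layout, and the toy graph are assumptions made for illustration only. The point it shows is that the bias depends only on whether an edge exists, so a near-zero-weight edge counts as much as a strong one when deciding whether a candidate is "close" to the previous node.

```python
# Illustrative sketch only: classic node2vec bias, with hypothetical names.
# A graph is a dict mapping node -> {neighbor: edge_weight}.

def node2vec_bias(prev, nxt, graph, p=1.0, q=1.0):
    """Bias for a walk that came from `prev` and is considering `nxt`."""
    if nxt == prev:
        return 1.0 / p      # step back to the previous node
    if nxt in graph[prev]:
        return 1.0          # an edge to `prev` exists; its weight is ignored
    return 1.0 / q          # no edge to `prev`: move outward


def transition_probs(prev, cur, graph, p=1.0, q=1.0):
    """Normalized probabilities of leaving `cur` after arriving from `prev`."""
    unnorm = {
        nxt: node2vec_bias(prev, nxt, graph, p, q) * weight
        for nxt, weight in graph[cur].items()
    }
    total = sum(unnorm.values())
    return {nxt: value / total for nxt, value in unnorm.items()}


# Toy weighted graph (assumed): the a-c edge is very weak.
GRAPH = {
    "a": {"b": 1.0, "c": 0.001},
    "b": {"a": 1.0, "c": 1.0, "d": 1.0},
    "c": {"a": 0.001, "b": 1.0},
    "d": {"b": 1.0},
}

if __name__ == "__main__":
    # Walk arrived at "b" from "a". Node "c" is treated as adjacent to "a"
    # despite the tiny a-c weight, so it receives bias 1 rather than 1/q.
    print(transition_probs("a", "b", GRAPH, p=4.0, q=2.0))
```

With $p = q = 1$ every bias equals 1 and the walk reduces to an ordinary weighted first-order random walk, which is the unbiased case mentioned above.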