Graph embedding is a transformation of the nodes of a graph into a set of vectors. A good embedding should capture the graph topology, node-to-node relationships, and other relevant information about the graph, its subgraphs, and its nodes. If these objectives are achieved, an embedding is a meaningful, understandable, compressed representation of a network that can be used by other machine learning tools for tasks such as node classification, community detection, or link prediction. The main challenge is ensuring that an embedding describes the properties of the graph well. As a result, selecting the best embedding is a challenging task that very often requires domain experts. In this paper, we conduct an extensive series of experiments with selected graph embedding algorithms, on both real-world networks and artificially generated ones. Based on those experiments, we formulate two general conclusions. First, if one needs to pick a single embedding algorithm before running any experiments, then node2vec is the best choice, as it performed best in our tests. Having said that, there is no single winner across all tests and, additionally, most embedding algorithms are randomized and have hyperparameters that should be tuned. Therefore, our main recommendation for practitioners is, if possible, to generate several embeddings for the problem at hand and then use a general framework that provides a tool for unsupervised graph embedding comparison. This framework (introduced recently in the literature and readily available in a GitHub repository) assigns a divergence score to each embedding to help distinguish good ones from bad ones.