Network embedding approaches have recently attracted considerable interest because they learn low-dimensional vector representations of nodes. Embeddings based on matrix factorization are effective, but they are usually computationally expensive because of the eigendecomposition step. In this paper, we propose a Random RangE FInder based Network Embedding (REFINE) algorithm, which can embed one million nodes (YouTube) within 30 seconds on a single thread. REFINE is 10x faster than ProNE, which is itself 10-400x faster than other methods such as LINE, DeepWalk, Node2Vec, GraRep, and HOPE. First, we formulate our network embedding approach as a skip-gram model with an orthogonal constraint and reformulate it as a matrix factorization problem. Instead of using randomized tSVD (truncated SVD) as other methods do, we employ the Randomized Blocked QR decomposition to obtain the node representations quickly. Moreover, we design a simple but efficient spectral filter for network enhancement to capture higher-order information in the node representations. Experimental results show that REFINE is very efficient on datasets of different sizes (from thousands to millions of nodes/edges) for node classification, while maintaining good performance.
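
To make the core idea concrete, the following is a minimal sketch of a generic randomized range finder built on QR factorizations. It is not the REFINE implementation. The function name `randomized_range_finder`, its parameters `dim`, `power_iters`, and `seed`, and the toy low-rank matrix standing in for a proximity matrix are all assumptions made for illustration. The blocked QR and the spectral filter mentioned in the abstract are not reproduced here.

```python
import numpy as np


def randomized_range_finder(A, dim, power_iters=2, seed=0):
    """Return Q (orthonormal columns) whose span approximates the range of A.

    Hypothetical helper for illustration only; not the authors' code.
    """
    rng = np.random.default_rng(seed)
    # Sample the range of A with a Gaussian test matrix.
    omega = rng.standard_normal((A.shape[1], dim))
    Q, _ = np.linalg.qr(A @ omega)
    # Power iterations sharpen the estimate; re-orthonormalizing with QR
    # after each multiplication keeps the computation numerically stable.
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)
    return Q


# Toy usage: a symmetric rank-5 matrix stands in for a proximity matrix.
rng = np.random.default_rng(1)
U = rng.standard_normal((50, 5))
A = U @ U.T
Q = randomized_range_finder(A, dim=5)

# Q satisfies the orthogonality constraint Q^T Q = I, and projecting A
# onto span(Q) should recover A almost exactly because rank(A) = dim.
rel_err = np.linalg.norm(A - Q @ (Q.T @ A)) / np.linalg.norm(A)
```

In this sketch the rows of `Q` play the role of node vectors. Using only QR factorizations avoids a separate eigendecomposition or SVD step, which is the cost the abstract identifies as the bottleneck of matrix factorization methods.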