Subgraph-based graph representation learning (SGRL) has recently been proposed to address several fundamental challenges faced by canonical graph neural networks (GNNs), and has demonstrated advantages in many important data science applications such as link, relation, and motif prediction. However, current SGRL approaches suffer from scalability issues because they require extracting a subgraph for each training or test query. Recent solutions that scale up canonical GNNs may not apply to SGRL. Here, we propose SUREL, a novel framework for scalable SGRL that co-designs the learning algorithm and its system support. SUREL adopts a walk-based decomposition of subgraphs and reuses the walks to form subgraphs, which substantially reduces the redundancy of subgraph extraction and supports parallel computation. Experiments on six homogeneous, heterogeneous, and higher-order graphs with millions of nodes and edges demonstrate the effectiveness and scalability of SUREL. In particular, compared to SGRL baselines, SUREL achieves a 10$\times$ speed-up with comparable or even better prediction performance, while compared to canonical GNNs, SUREL achieves a 50% improvement in prediction accuracy.
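The walk-reuse idea can be illustrated with a minimal sketch. It is not the SUREL implementation: the function names `sample_walks` and `query_subgraph_nodes`, the adjacency-list input, and the uniform random-walk sampler are all assumptions made for illustration. The sketch only shows the general pattern described above: walks are sampled once per node, and the subgraph for any query is then formed by joining the stored walks of the query's nodes instead of running a fresh extraction.

```python
import random


def sample_walks(adj, num_walks, walk_len, seed=0):
    """Sample `num_walks` uniform random walks of up to `walk_len` steps
    from every node, once, so they can be reused by later queries.

    `adj` is a hypothetical adjacency-list dict: node -> list of neighbours.
    """
    rng = random.Random(seed)
    walks = {}
    for start in adj:
        node_walks = []
        for _ in range(num_walks):
            walk = [start]
            for _ in range(walk_len):
                neighbours = adj[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            node_walks.append(walk)
        walks[start] = node_walks
    return walks


def query_subgraph_nodes(walks, query):
    """Form the node set of a query's subgraph by joining the stored walks
    of the query nodes (e.g. the two endpoints of a candidate link)."""
    nodes = set()
    for q in query:
        for walk in walks[q]:
            nodes.update(walk)
    return nodes
```

Because the stored walks of a node are shared by every query that involves it, the sampling cost is paid once per node rather than once per query, and the per-node sampling loop is independent across nodes, so it could in principle be run in parallel.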