In this work we propose R-GPM, a parallel computing framework for graph pattern mining (GPM) through a user-defined subgraph relation. More specifically, we enable the computation of statistics of patterns through their subgraph classes, generalizing traditional GPM methods. R-GPM provides efficient estimators for these statistics by employing a MCMC sampling algorithm combined with several optimizations. We provide both theoretical guarantees and empirical evaluations of our estimators in application scenarios such as stochastic optimization of deep high-order graph neural network models and pattern (motif) counting. We also propose and evaluate optimizations that enable improvements of our estimators accuracy, while reducing their computational costs in up to 3-orders-of-magnitude. Finally,we show that R-GPM is scalable, providing near-linear speedups on 44 cores in all of our tests.
翻译:在这项工作中,我们提议了R-GPM,这是一个通过用户定义的子谱关系进行图案模式采矿的平行计算框架。更具体地说,我们能够通过其子谱分类来计算图案的统计,并推广传统的GPM方法。R-GPM通过使用MCMC抽样算法,加上若干优化,为这些统计提供了有效的估计数据。我们提供了理论保障和对我们在应用情景中的估测者的经验性评估,例如深高阶图案神经网络模型和图案(motif)的随机优化。我们还提出和评估优化,以便能够改进我们的估计数据的准确性,同时将其计算成本降低到3级的放大率。最后,我们表明R-GMM可升级,在全部测试中提供44个核心的近线性加速。