Small subgraphs (graphlets) are important features to describe fundamental units of a large network. The calculation of the subgraph frequency distributions has a wide application in multiple domains including biology and engineering. Unfortunately due to the inherent complexity of this task, most of the existing methods are computationally intensive and inefficient. In this work, we propose GNNS, a novel representational learning framework that utilizes graph neural networks to sample subgraphs efficiently for estimating their frequency distribution. Our framework includes an inference model and a generative model that learns hierarchical embeddings of nodes, subgraphs, and graph types. With the learned model and embeddings, subgraphs are sampled in a highly scalable and parallel way and the frequency distribution estimation is then performed based on these sampled subgraphs. Eventually, our methods achieve comparable accuracy and a significant speedup by three orders of magnitude compared to existing methods.
翻译:小型子图(图)是描述大型网络基本单位的重要特征。子图频率分布的计算在生物和工程等多个领域广泛应用。不幸的是,由于这项任务的内在复杂性,大多数现有方法都是计算密集和低效率的。在这项工作中,我们提议GNNS,这是一个新的代表性学习框架,利用图形神经网络来有效地取样子图,以估计其频率分布。我们的框架包括一个推论模型和一个基因化模型,以学习节点、子图和图形类型的等级嵌入。随着所学的模型和嵌入,子图以高度可缩放和平行的方式取样,然后根据这些抽样子图进行频率分布估计。最终,我们的方法实现了相似的精确度,并且比现有方法高出了3个数量级的显著速度。