Sampling is a critical operation in the training of Graph Neural Networks (GNNs), as it helps reduce the training cost. Previous works have explored improving sampling algorithms through mathematical and statistical methods. However, there remains a gap between sampling algorithms and hardware. Without considering hardware, algorithm designers optimize sampling only at the algorithm level, missing the great potential of improving the efficiency of existing sampling algorithms by leveraging hardware features. In this paper, we first propose a unified programming model for mainstream sampling algorithms, termed GNNSampler, which covers the key processes of sampling algorithms across various categories. Second, we explore the data locality among nodes and their neighbors (a hardware-level feature) in real-world datasets to alleviate irregular memory access in sampling. Third, we implement locality-aware optimizations in GNNSampler for diverse sampling algorithms to optimize the general sampling process in GNN training. Finally, we conduct extensive experiments on large graph datasets to analyze the relevance between training time, model accuracy, and hardware-level metrics, which helps achieve a good trade-off between time and accuracy in GNN training. Extensive experimental results show that our method is universal to mainstream sampling algorithms and reduces the training time of GNNs (ranging from 4.83% with layer-wise sampling to 44.92% with subgraph-based sampling) with comparable accuracy.
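To make the locality-aware idea concrete, below is a minimal sketch, not the paper's actual implementation, of a neighbor-sampling step that prefers neighbors likely to reside in nearby memory. The function name `locality_aware_sample`, the CSR arrays `indptr`/`indices`, and the `window` radius are all illustrative assumptions; the real GNNSampler programming model and its locality heuristics are defined in the paper body.

```python
import numpy as np

def locality_aware_sample(indptr, indices, node, fanout, window=64):
    """Sample up to `fanout` neighbors of `node` from a CSR adjacency,
    preferring neighbors whose IDs fall within `window` of `node`.

    After a locality-improving graph reordering, ID proximity serves as a
    proxy for memory proximity, so restricting sampling to nearby IDs can
    reduce irregular memory access. `window` is a hypothetical parameter,
    not one taken from the paper.
    """
    neighbors = indices[indptr[node]:indptr[node + 1]]
    if neighbors.size == 0:
        return neighbors
    # Keep neighbors close to `node` in ID space (proxy for data locality).
    local = neighbors[np.abs(neighbors - node) <= window]
    # Fall back to the full neighbor set if the local pool is too small,
    # so sampling quality (and hence accuracy) is not overly constrained.
    pool = local if local.size >= fanout else neighbors
    k = min(fanout, pool.size)
    return np.random.choice(pool, size=k, replace=False)

# Usage on a toy 4-node CSR graph: edges 0-1, 0-2, 1-3, 2-3 (directed).
indptr = np.array([0, 2, 3, 4, 4])
indices = np.array([1, 2, 3, 3])
print(locality_aware_sample(indptr, indices, node=0, fanout=2))
```

The fallback to the full neighbor set reflects the trade-off the abstract emphasizes: locality-restricted sampling speeds up memory access, but sampling must remain representative enough to keep accuracy comparable.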