Dynamic graph neural networks (DGNNs) are becoming increasingly popular because of their widespread use in capturing dynamic features of the real world. A variety of dynamic graph neural networks designed from algorithmic perspectives have succeeded in incorporating temporal information into graph processing. Despite their promising algorithmic performance, deploying DGNNs on hardware presents additional challenges due to model complexity, diversity, and the nature of time dependency. Meanwhile, the differences between DGNNs and static graph neural networks make hardware-related optimizations for static graph neural networks unsuitable for DGNNs. In this paper, we select eight prevailing DGNNs with different characteristics and profile them on both CPU and GPU. The profiling results are summarized and analyzed, providing in-depth insights into the bottlenecks of DGNNs on hardware and identifying potential optimization opportunities for future DGNN acceleration. Following a comprehensive survey, we provide a detailed analysis of DGNN performance bottlenecks on hardware, including temporal data dependency, workload imbalance, data movement, and GPU warm-up, and we suggest several optimizations from both software and hardware perspectives. This paper is the first to provide an in-depth analysis of the hardware performance of DGNNs. Code is available at https://github.com/sharc-lab/DGNN_analysis.