Recent deep learning models have moved beyond low-dimensional regular grids such as images, video, and speech to high-dimensional graph-structured data, such as social networks, brain connections, and knowledge graphs. This evolution has led to large graph-based irregular and sparse models that go beyond what existing deep learning frameworks are designed for. Furthermore, these models are not easily amenable to efficient, at-scale acceleration on parallel hardware (e.g., GPUs). We introduce NGra, the first parallel processing framework for graph-based deep neural networks (GNNs). NGra presents a new SAGA-NN model for expressing deep neural networks as vertex programs, with each layer organized into well-defined graph operation stages (Scatter, ApplyEdge, Gather, ApplyVertex). This model not only allows GNNs to be expressed intuitively, but also facilitates mapping to an efficient dataflow representation. NGra addresses the scalability challenge transparently through automatic graph partitioning and chunk-based stream processing out of GPU core or over multiple GPUs, carefully considering data locality, data movement, and the overlap of parallel processing with data movement. NGra further achieves efficiency through highly optimized Scatter/Gather operators on GPUs despite the sparsity of graph data. Our evaluation shows that NGra scales to large real-world graphs that none of the existing frameworks can handle directly, while achieving up to about a 4x speedup, even at small scales, over the multiple-baseline design on TensorFlow.
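To make the four-stage layer structure concrete, the following is a minimal sketch of one SAGA-NN-style layer as a vertex program. This is an illustration under simplifying assumptions, not NGra's actual API: the function `saga_nn_layer`, the edge-list representation, and the choice of identity for ApplyEdge, sum for Gather, and a linear layer with ReLU for ApplyVertex are all hypothetical stand-ins for the stage semantics described above.

```python
# Hedged sketch of one SAGA-NN layer as a vertex program.
# Stage names follow the abstract; everything else is illustrative.
import numpy as np

def saga_nn_layer(H, edges, W):
    """One layer over vertex features H (n x d) and a directed edge list.

    Stages:
      Scatter     - each vertex emits its feature along its out-edges
      ApplyEdge   - per-edge computation (identity in this sketch)
      Gather      - each vertex accumulates (sums) incoming edge values
      ApplyVertex - per-vertex update (here: linear transform + ReLU)
    """
    # Scatter: place the source vertex feature on each edge
    edge_msgs = [(dst, H[src]) for src, dst in edges]
    # ApplyEdge: per-edge transform; identity here for simplicity
    edge_msgs = [(dst, msg) for dst, msg in edge_msgs]
    # Gather: sum messages arriving at each destination vertex
    acc = np.zeros_like(H)
    for dst, msg in edge_msgs:
        acc[dst] += msg
    # ApplyVertex: combine accumulated messages with the vertex's own
    # feature, then apply a linear layer followed by ReLU
    return np.maximum((H + acc) @ W, 0.0)

# Toy graph: 3 vertices, directed ring 0 -> 1 -> 2 -> 0
H = np.eye(3)          # one-hot vertex features
W = np.eye(3)          # identity weights, so the stages are easy to trace
out = saga_nn_layer(H, [(0, 1), (1, 2), (2, 0)], W)
```

Expressing a layer this way keeps each stage a separate, data-parallel operation, which is what allows a framework to map the layer onto a dataflow graph and schedule the sparse Scatter/Gather stages independently of the dense per-vertex computation.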