NeutronTP: Load-Balanced Distributed Full-Graph GNN Training with Tensor Parallelism - 专知论文

会员服务 ·

0

GNN · 图 · Tensor · 划分 · 操作 ·

2024 年 12 月 29 日

NeutronTP: Load-Balanced Distributed Full-Graph GNN Training with Tensor Parallelism

翻译：暂无翻译

Xin Ai,Hao Yuan,Zeyu Ling,Qiange Wang,Yanfeng Zhang,Zhenbo Fu,Chaoyi Chen,Yu Gu,Ge Yu

from arxiv, 14 pages 16 figures, VLDB2025

Graph neural networks (GNNs) have emerged as a promising direction. Training large-scale graphs that relies on distributed computing power poses new challenges. Existing distributed GNN systems leverage data parallelism by partitioning the input graph and distributing it to multiple workers. However, due to the irregular nature of the graph structure, existing distributed approaches suffer from unbalanced workloads and high overhead in managing cross-worker vertex dependencies. In this paper, we leverage tensor parallelism for distributed GNN training. GNN tensor parallelism eliminates cross-worker vertex dependencies by partitioning features instead of graph structures. Different workers are assigned training tasks on different feature slices with the same dimensional size, leading to a complete load balance. We achieve efficient GNN tensor parallelism through two critical functions. Firstly, we employ a generalized decoupled training framework to decouple NN operations from graph aggregation operations, significantly reducing the communication overhead caused by NN operations which must be computed using complete features. Secondly, we employ a memory-efficient task scheduling strategy to support the training of large graphs exceeding single GPU memory, while further improving performance by overlapping communication and computation. By integrating the above techniques, we propose a distributed GNN training system NeutronTP. Our experimental results on a 16-node Aliyun cluster demonstrate that NeutronTP achieves 1.29X-8.72X speedup over state-of-the-art GNN systems including DistDGL, NeutronStar, and Sancus.

翻译：暂无翻译

0

相关内容

GNN

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

45+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

79+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

30+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

48+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

35+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

58+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

154+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

26+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

28+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

17+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

42+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

17+阅读 · 2018年12月24日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

14+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

11+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

73+阅读 · 2016年11月26日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

11+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

1+阅读 · 2015年12月31日

基于DASH的交互式三维视频系统建模

国家自然科学基金

1+阅读 · 2015年12月31日

DMB信号水汽探测方法若干问题研究

国家自然科学基金

1+阅读 · 2015年12月31日

Volterra积分微分方程的多区间Chebyshev和Legendre谱配置法

国家自然科学基金

0+阅读 · 2015年12月31日

直接优化半周长线长的VLSI两阶段迭代布局算法研究

国家自然科学基金

0+阅读 · 2015年12月31日

SDN数据平面中大规模流表的高性能查找方法研究

国家自然科学基金

3+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

1+阅读 · 2014年12月31日

SRE-Conv: Symmetric Rotation Equivariant Convolution for Biomedical Image Classification

Arxiv

0+阅读 · 1月16日

ENCODE: Encoding NetFlows for Network Anomaly Detection

Arxiv

0+阅读 · 1月8日

AsynEVO: Asynchronous Event-Driven Visual Odometry for Pure Event Streams

Arxiv

0+阅读 · 1月6日

SCSC: A Novel Standards-Compatible Semantic Communication Framework for Image Transmission

Arxiv

0+阅读 · 1月6日

Temporal Information Reconstruction and Non-Aligned Residual in Spiking Neural Networks for Speech Classification

Arxiv

0+阅读 · 2024年12月31日

ArgMed-Agents: Explainable Clinical Decision Reasoning with LLM Disscusion via Argumentation Schemes

Arxiv

0+阅读 · 2024年12月28日

GCN-ABFT: Low-Cost Online Error Checking for Graph Convolutional Networks

Arxiv

0+阅读 · 2024年12月24日

L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression

Arxiv

0+阅读 · 2024年12月24日

SwinGS: Sliding Window Gaussian Splatting for Volumetric Video Streaming with Arbitrary Length

Arxiv

0+阅读 · 2024年12月23日

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Arxiv

18+阅读 · 2019年11月20日

VIP会员

文章信息

相关主题

相关VIP内容

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

45+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

79+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

30+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

48+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

35+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

58+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

154+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《为美国防部和北约的 5G 制定敏捷标准流程》13页slides

《美国国防创新委员会：无人武器系统规模扩展路径》2025年最新报告

2025年美国陆军现代化：下一代作战坦克、两栖艇、未来垂直升降、先进武器、智能战场、训练

《美国防部战略管理规划：2024年度报告》最新176页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

26+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

28+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

17+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

42+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

17+阅读 · 2018年12月24日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

14+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

11+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

73+阅读 · 2016年11月26日

相关论文

SRE-Conv: Symmetric Rotation Equivariant Convolution for Biomedical Image Classification

Arxiv

0+阅读 · 1月16日

ENCODE: Encoding NetFlows for Network Anomaly Detection

Arxiv

0+阅读 · 1月8日

AsynEVO: Asynchronous Event-Driven Visual Odometry for Pure Event Streams

Arxiv

0+阅读 · 1月6日

SCSC: A Novel Standards-Compatible Semantic Communication Framework for Image Transmission

Arxiv

0+阅读 · 1月6日

Temporal Information Reconstruction and Non-Aligned Residual in Spiking Neural Networks for Speech Classification

Arxiv

0+阅读 · 2024年12月31日

ArgMed-Agents: Explainable Clinical Decision Reasoning with LLM Disscusion via Argumentation Schemes

Arxiv

0+阅读 · 2024年12月28日

GCN-ABFT: Low-Cost Online Error Checking for Graph Convolutional Networks

Arxiv

0+阅读 · 2024年12月24日

L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression

Arxiv

0+阅读 · 2024年12月24日

SwinGS: Sliding Window Gaussian Splatting for Volumetric Video Streaming with Arbitrary Length

Arxiv

0+阅读 · 2024年12月23日

Knowledge Graph Alignment Network with Gated Multi-hop Neighborhood Aggregation

Arxiv

18+阅读 · 2019年11月20日

相关基金

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

11+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

1+阅读 · 2015年12月31日

基于DASH的交互式三维视频系统建模

国家自然科学基金

1+阅读 · 2015年12月31日

DMB信号水汽探测方法若干问题研究

国家自然科学基金

1+阅读 · 2015年12月31日

Volterra积分微分方程的多区间Chebyshev和Legendre谱配置法

国家自然科学基金

0+阅读 · 2015年12月31日

直接优化半周长线长的VLSI两阶段迭代布局算法研究

国家自然科学基金

0+阅读 · 2015年12月31日

SDN数据平面中大规模流表的高性能查找方法研究

国家自然科学基金

3+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

1+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员