Large deep learning models have shown great potential for delivering exceptional results in various applications. However, training them is extremely challenging because of their vast parameter counts, often reaching hundreds of billions of parameters. Common distributed training methods, such as data parallelism, tensor parallelism, and pipeline parallelism, require substantial data communication throughout training, which leads to prolonged wait times for some machines when the cluster is physically dispersed. To address this issue, we propose Hulk, a novel solution that uses a modified graph neural network to optimize distributed computing systems. Hulk not only improves the efficiency of data communication between machines in different regions, whether across countries or within the same city, but also produces an optimal parallel deployment of the model across the cluster. For example, it can place certain layers on a machine in a specific region or route specific model parameters to a machine in a particular location. In our experiments, Hulk improved the time efficiency of training large deep learning models on distributed systems by more than 20\%. Our open-source collection of unlabeled data is available at: https://github.com/DLYuanGod/Hulk.
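To make the idea concrete, the sketch below illustrates one plausible reading of the approach the abstract describes: the cluster is represented as a graph whose nodes are machines and whose edge weights reflect inter-machine bandwidth, a small message-passing GNN embeds the machines, and the embeddings drive a placement of model layers onto machines. This is a minimal, hypothetical illustration only; the class names, features, scoring rule, and greedy placement are assumptions for exposition and are not Hulk's actual implementation.

```python
# Hypothetical sketch (not the authors' code): embed cluster machines with a
# tiny message-passing GNN, then greedily place model layers onto machines.
import torch
import torch.nn as nn


class ClusterGNN(nn.Module):
    """One round of mean-style message passing over the cluster graph."""

    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.self_lin = nn.Linear(in_dim, hidden_dim)
        self.neigh_lin = nn.Linear(in_dim, hidden_dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (num_machines, in_dim) per-machine features, e.g. compute
        #             speed, memory, measured latency (illustrative choices).
        # adj:        (num_machines, num_machines) row-normalized weights
        #             derived from inter-machine bandwidth.
        neighbor_msg = adj @ node_feats
        return torch.relu(self.self_lin(node_feats) + self.neigh_lin(neighbor_msg))


def place_layers(machine_emb: torch.Tensor, layer_costs: torch.Tensor) -> list:
    """Toy placement rule: send each layer to the machine with the largest
    remaining capacity score, then charge that machine the layer's cost."""
    capacity = machine_emb.sum(dim=1).clone()           # scalar score per machine
    placement = []
    for cost in layer_costs:
        best = int(torch.argmax(capacity))
        placement.append(best)
        capacity[best] -= cost                          # discourage overloading one node
    return placement


if __name__ == "__main__":
    num_machines, num_layers = 4, 12
    feats = torch.rand(num_machines, 8)                 # fake machine features
    bandwidth = torch.rand(num_machines, num_machines)  # fake link bandwidths
    adj = bandwidth / bandwidth.sum(dim=1, keepdim=True)

    gnn = ClusterGNN(in_dim=8, hidden_dim=16)
    emb = gnn(feats, adj)
    costs = torch.rand(num_layers)                      # fake per-layer compute cost
    print(place_layers(emb, costs))                     # e.g. [2, 0, 3, ...]
```

In practice, such a placement would feed a pipeline- or tensor-parallel launcher so that layers assigned to the same machine avoid cross-region transfers; the greedy rule above is only a stand-in for whatever optimization the paper's modified GNN performs.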