In recent years, to meet the resource-intensive computational demands of training deep neural networks (DNNs), it has become widely accepted that exploiting parallelism in large-scale computing clusters is critical for the efficient deployment of DNN training jobs. However, existing resource schedulers for traditional computing clusters are not well suited for DNN training, resulting in unsatisfactory job completion times. The limitations of these resource scheduling schemes motivate us to propose a new cluster resource scheduling framework that leverages the special layered structure of DNN jobs to significantly improve their job completion times. Our contributions in this paper are three-fold: i) We develop a new resource scheduling analytical model that captures the layered structure of DNNs, which enables us to analytically formulate the resource scheduling optimization problem for DNN training in computing clusters; ii) Based on the proposed performance analytical model, we develop an efficient resource scheduling algorithm for the widely adopted parameter-server architecture, using a sum-of-ratios multi-dimensional-knapsack decomposition (SMD) method that offers strong performance guarantees; iii) We conduct extensive numerical experiments to demonstrate the effectiveness of the proposed scheduling algorithm and its superior performance over the state of the art.
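For readers less familiar with the parameter-server architecture that contribution ii) builds on, the following is a minimal sketch of synchronous parameter-server training on a toy linear-regression workload. The names used here (ParameterServer, local_gradient) are illustrative assumptions for this sketch, not part of the paper's system or API:

```python
# A minimal sketch of the parameter-server pattern: workers compute gradients
# on private data shards; the server averages them and updates the shared model.
# All names and the toy workload are hypothetical illustrations.
import numpy as np

class ParameterServer:
    """Holds the shared model and applies averaged worker gradients."""
    def __init__(self, dim, lr=0.1):
        self.weights = np.zeros(dim)
        self.lr = lr

    def apply(self, gradients):
        # Synchronous update: average the workers' gradients, take one SGD step.
        self.weights -= self.lr * np.mean(gradients, axis=0)

def local_gradient(weights, X, y):
    # Mean-squared-error gradient for a linear model on one worker's data shard.
    residual = X @ weights - y
    return 2.0 * (X.T @ residual) / len(y)

rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])
server = ParameterServer(dim=2)

# Each of four workers holds a private shard of the training data.
shards = []
for _ in range(4):
    X = rng.normal(size=(64, 2))
    shards.append((X, X @ true_w + 0.01 * rng.normal(size=64)))

for step in range(200):
    grads = [local_gradient(server.weights, X, y) for X, y in shards]
    server.apply(grads)

print(server.weights)  # converges toward true_w = [1.5, -2.0]
```

In a real DNN training job, each iteration of this push/pull exchange happens per layer, which is exactly the layered structure the proposed scheduling model exploits.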