Advanced Synchronization Techniques for Task-based Runtime Systems

May 17, 2021

David Álvarez, Kevin Sala, Marcos Maroñas, Aleix Roca, Vicenç Beltran

From arXiv; 14 pages, 11 figures. Published in the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'21).

Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow execution model to exploit dynamic, irregular, and nested parallelism. Providing an efficient implementation that scales well with small-granularity tasks remains a challenge, and bottlenecks can manifest in several runtime components. In this paper, we analyze the limiting factors in the scalability of a task-based runtime system and propose individual solutions for each of the challenges, including a wait-free dependency system and a novel scalable scheduler design based on delegation. We evaluate how the optimizations impact the overall performance of the runtime, both individually and in combination. We also compare the resulting runtime against state-of-the-art OpenMP implementations, showing equivalent or better performance, especially for fine-grained tasks.
