DARM: Control-Flow Melding for SIMT Thread Divergence Reduction - Zhuanzhi Papers

December 18, 2021

DARM: Control-Flow Melding for SIMT Thread Divergence Reduction

Charitha Saumya, Kirshanthan Sundararajah, Milind Kulkarni

GPGPUs use the Single-Instruction-Multiple-Thread (SIMT) execution model, in which a group of threads (a wavefront or warp) executes instructions in lockstep. When the threads in a group reach a branch instruction, they may not all take the same path, a phenomenon known as control-flow divergence. Control-flow divergence degrades performance because both paths of the branch must be executed one after the other. Prior research has addressed this issue primarily through architectural modifications. We observe that certain GPGPU kernels with control-flow divergence have similar control-flow structures with similar instructions on both sides of a branch. This structure can be exploited by melding the two sides of the branch, which allows threads to reconverge early and reduces divergence. In this work, we present DARM, a compiler analysis and transformation framework that can meld divergent control-flow structures with similar instruction sequences. We show that DARM can reduce the performance degradation from control-flow divergence.
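To make the cost argument in the abstract concrete, the following is a minimal sketch of a toy issue-slot model, not DARM's actual analysis. It assumes each instruction costs one issue slot per warp, that the two branch sides are straight-line opcode lists, and that instructions with the same opcode can be melded when they line up in a longest common subsequence. The function names, the opcode lists, and the use of a longest common subsequence as the alignment are all illustrative assumptions.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two opcode sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def divergent_slots(then_ops: Sequence[str], else_ops: Sequence[str],
                    takes_then: Sequence[bool]) -> int:
    """Issue slots for one warp when each taken path runs one after the other."""
    slots = 0
    if any(takes_then):          # at least one thread takes the 'then' side
        slots += len(then_ops)
    if not all(takes_then):      # at least one thread takes the 'else' side
        slots += len(else_ops)
    return slots


def melded_slots(then_ops: Sequence[str], else_ops: Sequence[str],
                 takes_then: Sequence[bool]) -> int:
    """Issue slots when aligned instructions are issued once for the whole warp.

    Assumption: operand selection for melded instructions is free in this model.
    """
    if all(takes_then):
        return len(then_ops)
    if not any(takes_then):
        return len(else_ops)
    shared = lcs_length(then_ops, else_ops)
    return len(then_ops) + len(else_ops) - shared


# Hypothetical branch sides that differ in a single instruction.
then_ops = ["load", "mul", "add", "store"]
else_ops = ["load", "mul", "sub", "store"]
# A 32-thread warp in which half the threads take each side.
mask = [True] * 16 + [False] * 16

print(divergent_slots(then_ops, else_ops, mask))  # 8
print(melded_slots(then_ops, else_ops, mask))     # 5
```

In this model the divergent warp pays for both sides in full, while the melded version pays once for the three shared instructions and separately only for `add` and `sub`. Real hardware costs depend on latencies, the extra select instructions that melding introduces, and the reconvergence mechanism, none of which this sketch captures.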
