We present our experience porting optimized CUDA implementations to oneAPI. We focus on the use case of numerical integration, particularly the CUDA implementations of PAGANI and $m$-Cubes. We faced several challenges that caused performance degradation in the oneAPI ports, including differences in the number of registers used per thread, compiler optimizations, and the mapping of CUDA library calls to their oneAPI equivalents. After addressing these challenges, we tested both the PAGANI and $m$-Cubes integrators on numerous integrands with various characteristics. To evaluate the quality of the ports, we collected performance metrics of the CUDA and oneAPI implementations on an Nvidia V100 GPU. We found that the oneAPI ports often achieve performance comparable to the CUDA versions, and that they are at most 10% slower.