HEP 港端 HEP 参数焦量计模拟码对 GPUs 的模拟码 (Porting HEP Parameterized Calorimeter Simulation Code to GPUs) - 专知论文

会员服务 ·

0

FAST · 数据分析 · CUDA · Processing（编程语言） · 英伟达（NVIDIA） ·

2021 年 5 月 18 日

Porting HEP Parameterized Calorimeter Simulation Code to GPUs

翻译：HEP 港端 HEP 参数焦量计模拟码对 GPUs 的模拟码

Zhihua Dong,Heather Gray,Charles Leggett,Meifeng Lin,Vincent R. Pascuzzi,Kwangmin Yu

from arxiv, 15 pages, 1 figure, 8 tables, 2 listings, submitted to Frontiers in Big Data (Big Data in AI and High Energy Physics)

The High Energy Physics (HEP) experiments, such as those at the Large Hadron Collider (LHC), traditionally consume large amounts of CPU cycles for detector simulations and data analysis, but rarely use compute accelerators such as GPUs. As the LHC is upgraded to allow for higher luminosity, resulting in much higher data rates, purely relying on CPUs may not provide enough computing power to support the simulation and data analysis needs. As a proof of concept, we investigate the feasibility of porting a HEP parameterized calorimeter simulation code to GPUs. We have chosen to use FastCaloSim, the ATLAS fast parametrized calorimeter simulation. While FastCaloSim is sufficiently fast such that it does not impose a bottleneck in detector simulations overall, significant speed-ups in the processing of large samples can be achieved from GPU parallelization at both the particle (intra-event) and event levels; this is especially beneficial in conditions expected at the high-luminosity LHC, where extremely high per-event particle multiplicities will result from the many simultaneous proton-proton collisions. We report our experience with porting FastCaloSim to NVIDIA GPUs using CUDA. A preliminary Kokkos implementation of FastCaloSim for portability to other parallel architectures is also described.

翻译：高能物理(HEP)实验,如大型高原相撞机(LHC)的实验,传统上消耗大量CPU周期进行检测模拟和数据分析,但很少使用计算加速器,如GPUs。LHC升级后允许更高的光度,导致数据率高得多,完全依赖CPU可能无法提供足够的计算能力来支持模拟和数据分析需要。作为概念的证明,我们调查将HEP参数化热量计模拟代码移植到GPUs的可行性。我们选择了使用快速CaloSim(ATLAS)快速超模化热量计模拟。虽然快速CaloSim(TalCal)的模拟速度足够快,因此它不会在检测模拟中造成更多的瓶颈,但是在大型样品的处理中,在粒子(事件)和事件级别上都能够实现显著的加速速度。作为概念的证明,我们在高光度LHCHC(高反常数粒度)模拟代码中,我们选择使用ATLLAS(ATLAS)快速准焦度快速焦度模拟模拟。使用GC(C)的快速同步)AA(S)的同步)系统(SUDIS)系统(S)的同步)实施模型(S)初步系统(SUDIS-S)系统(S)系统(S)的同步)系统(PIFUDUDUDUDS)结构,将产生大量(S)其他同步)的同步)的初步报告。

0

相关内容

FAST

FAST：Conference on File and Storage Technologies。 Explanation：文件和存储技术会议。 Publisher：USENIX。 SIT:http://dblp.uni-trier.de/db/conf/fast/

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

专知会员服务

87+阅读 · 2020年5月11日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

如何加速NVIDIA gpu上的训练、推理和ML应用？108页ppt，Accelerating training, inference, and ML applications on NVIDIA GPUs

如何加速NVIDIA gpu上的训练、推理和ML应用？108页ppt，Accelerating training, inference, and ML applications on NVIDIA GPUs

专知会员服务

61+阅读 · 2019年12月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Opencv+TF-Slim实现图像分类及深度特征提取

Opencv+TF-Slim实现图像分类及深度特征提取

极市平台

16+阅读 · 2019年8月19日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

spinningup.openai 强化学习资源完整

spinningup.openai 强化学习资源完整

CreateAMind

6+阅读 · 2018年12月17日

Facebook PyText 在 Github 上开源了

Facebook PyText 在 Github 上开源了

AINLP

7+阅读 · 2018年12月14日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

(TensorFlow)实时语义分割比较研究

(TensorFlow)实时语义分割比较研究

机器学习研究会

9+阅读 · 2018年3月12日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】MXNet深度情感分析实战

【推荐】MXNet深度情感分析实战

机器学习研究会

16+阅读 · 2017年10月4日

Optimising cost vs accuracy of decentralised analytics in fog computing environments

Optimising cost vs accuracy of decentralised analytics in fog computing environments

Arxiv

0+阅读 · 2021年7月9日

Optimum GSSK Transmission in Massive MIMO Systems Using the Box-LASSO Decoder

Optimum GSSK Transmission in Massive MIMO Systems Using the Box-LASSO Decoder

Arxiv

0+阅读 · 2021年7月8日

Direct detection of plasticity onset through total-strain profile evolution

Arxiv

0+阅读 · 2021年7月8日

Conditioned Source Separation for Music Instrument Performances

Arxiv

0+阅读 · 2021年7月7日

A nonBayesian view of Hempel's paradox of the ravens

Arxiv

0+阅读 · 2021年7月7日

Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data

Arxiv

12+阅读 · 2021年2月21日

Pointer Graph Networks

Pointer Graph Networks

Arxiv

7+阅读 · 2020年6月11日

Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems

Arxiv

7+阅读 · 2020年3月12日

Population Anomaly Detection through Deep Gaussianization

Arxiv

6+阅读 · 2018年5月5日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

VIP会员

文章信息

相关主题

Processing（编程语言）

英伟达（NVIDIA）

相关VIP内容

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

【陈天奇】TVM：端到端自动深度学习编译器，244页ppt

专知会员服务

87+阅读 · 2020年5月11日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

如何加速NVIDIA gpu上的训练、推理和ML应用？108页ppt，Accelerating training, inference, and ML applications on NVIDIA GPUs

如何加速NVIDIA gpu上的训练、推理和ML应用？108页ppt，Accelerating training, inference, and ML applications on NVIDIA GPUs

专知会员服务

61+阅读 · 2019年12月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

Opencv+TF-Slim实现图像分类及深度特征提取

Opencv+TF-Slim实现图像分类及深度特征提取

极市平台

16+阅读 · 2019年8月19日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

spinningup.openai 强化学习资源完整

spinningup.openai 强化学习资源完整

CreateAMind

6+阅读 · 2018年12月17日

Facebook PyText 在 Github 上开源了

Facebook PyText 在 Github 上开源了

AINLP

7+阅读 · 2018年12月14日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

(TensorFlow)实时语义分割比较研究

(TensorFlow)实时语义分割比较研究

机器学习研究会

9+阅读 · 2018年3月12日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【推荐】MXNet深度情感分析实战

【推荐】MXNet深度情感分析实战

机器学习研究会

16+阅读 · 2017年10月4日

相关论文

Optimising cost vs accuracy of decentralised analytics in fog computing environments

Optimising cost vs accuracy of decentralised analytics in fog computing environments

Arxiv

0+阅读 · 2021年7月9日

Optimum GSSK Transmission in Massive MIMO Systems Using the Box-LASSO Decoder

Optimum GSSK Transmission in Massive MIMO Systems Using the Box-LASSO Decoder

Arxiv

0+阅读 · 2021年7月8日

Direct detection of plasticity onset through total-strain profile evolution

Arxiv

0+阅读 · 2021年7月8日

Conditioned Source Separation for Music Instrument Performances

Arxiv

0+阅读 · 2021年7月7日

A nonBayesian view of Hempel's paradox of the ravens

Arxiv

0+阅读 · 2021年7月7日

Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data

Arxiv

12+阅读 · 2021年2月21日

Pointer Graph Networks

Pointer Graph Networks

Arxiv

7+阅读 · 2020年6月11日

Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems

Arxiv

7+阅读 · 2020年3月12日

Population Anomaly Detection through Deep Gaussianization

Arxiv

6+阅读 · 2018年5月5日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

微信扫码咨询专知VIP会员