基于非同步任务基于任务的运行时系统内部的业绩计量:作为一种应用程序的双白矮人兼并 (Performance Measurements within Asynchronous Task-based Runtime Systems: A Double White Dwarf Merger as an Application) - 专知论文

会员服务 ·

0

Performer · 性能度量 · Performance · 全 · Networking ·

2021 年 4 月 29 日

Performance Measurements within Asynchronous Task-based Runtime Systems: A Double White Dwarf Merger as an Application

翻译：基于非同步任务基于任务的运行时系统内部的业绩计量:作为一种应用程序的双白矮人兼并

Patrick Diehl,Dominic Marcello,Parsa Armini,Hartmut Kaiser,Sagiv Shiber,Geoffrey C. Clayton,Juhan Frank,Gregor Daiß,Dirk Pflüger,David Eder,Alice Koniges,Kevin Huck

Analyzing performance within asynchronous many-task-based runtime systems is challenging because millions of tasks are launched concurrently. Especially for long-term runs the amount of data collected becomes overwhelming. We study HPX and its performance-counter framework and APEX to collect performance data and energy consumption. We added HPX application-specific performance counters to the Octo-Tiger full 3D AMR astrophysics application. This enables the combined visualization of physical and performance data to highlight bottlenecks with respect to different solvers. We examine the overhead introduced by these measurements, which is around 1%, with respect to the overall application runtime. We perform a convergence study for four different levels of refinement and analyze the application's performance with respect to adaptive grid refinement. The measurements' overheads are small, enabling the combined use of performance data and physical properties with the goal of improving the code's performance. All of these measurements were obtained on NERSC's Cori, Louisiana Optical Network Infrastructure's QueenBee2, and Indiana University's Big Red 3.

翻译：在非同步、多任务、多任务运行时间系统中分析性能是困难的,因为同时启动数以百万计的任务。特别是在长期运行的情况下,所收集的数据数量将变得惊人。我们研究了HPX及其性能反射框架和APEX,以收集性能数据和能源消耗情况。我们增加了HPX具体应用性能反向于Octo-Triger全3D AD ATM天体物理学应用。这样,物理和性能数据的综合可视化能够突出不同解答器的瓶颈问题。我们研究了这些测量结果带来的间接费用,在总体应用运行时间方面约为1%。我们对四个不同层次的改进进行了趋同研究,并分析了应用在适应性电网改进方面的绩效。测量的间接费用很小,能够将性能数据和物理特性结合起来使用,从而改进代码的性能。所有这些测量结果都是在NERSC的Cori、路易斯光学网络基础设施的Que Bee2和印第安大学的大红3上取得的。

0

相关内容

Performer

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

专知会员服务

197+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年6月24日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

计算机 | USENIX Security 2020等国际会议信息5条

计算机 | USENIX Security 2020等国际会议信息5条

Call4Papers

7+阅读 · 2019年4月25日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Cardinality Minimization, Constraints, and Regularization: A Survey

Arxiv

0+阅读 · 2021年6月17日

Memory Leak Detection Algorithms in the Cloud-based Infrastructure

Arxiv

0+阅读 · 2021年6月16日

The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication

Arxiv

0+阅读 · 2021年6月16日

Introducing Type Properties

Arxiv

0+阅读 · 2021年6月15日

A Survey on Distributed Machine Learning

Arxiv

45+阅读 · 2019年12月20日

A Survey on Edge Computing Systems and Tools

Arxiv

36+阅读 · 2019年11月7日

Physical Primitive Decomposition

Physical Primitive Decomposition

Arxiv

4+阅读 · 2018年9月13日

An Ontology-Based Dialogue Management System for Banking and Finance Dialogue Systems

Arxiv

4+阅读 · 2018年4月13日

Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems

Arxiv

3+阅读 · 2018年4月10日

Deep learning and its application to medical image segmentation

Arxiv

6+阅读 · 2018年3月23日

VIP会员

文章信息

相关主题

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

【斯坦福大学CS229】面向机器学习的线性代数和微积分要点速览(中文版)《CS 229 - Linear Algebra and Calculus refresher》by Afshine Amidi, Shervine Amidi

专知会员服务

197+阅读 · 2019年12月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NeurIPS2025】面向时间序列基础模型的合成序列符号数据生成方法

军事通信市场七大趋势概述

【CMU博士论文】深度学习中泛化的量化、理解与改进

面向低光照图像增强的扩散模型

相关资讯

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年6月24日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

计算机 | USENIX Security 2020等国际会议信息5条

计算机 | USENIX Security 2020等国际会议信息5条

Call4Papers

7+阅读 · 2019年4月25日

CCF A类 | 顶级会议RTSS 2019诚邀稿件

CCF A类 | 顶级会议RTSS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年4月17日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Cardinality Minimization, Constraints, and Regularization: A Survey

Arxiv

0+阅读 · 2021年6月17日

Memory Leak Detection Algorithms in the Cloud-based Infrastructure

Arxiv

0+阅读 · 2021年6月16日

The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication

Arxiv

0+阅读 · 2021年6月16日

Introducing Type Properties

Arxiv

0+阅读 · 2021年6月15日

A Survey on Distributed Machine Learning

Arxiv

45+阅读 · 2019年12月20日

A Survey on Edge Computing Systems and Tools

Arxiv

36+阅读 · 2019年11月7日

Physical Primitive Decomposition

Physical Primitive Decomposition

Arxiv

4+阅读 · 2018年9月13日

An Ontology-Based Dialogue Management System for Banking and Finance Dialogue Systems

Arxiv

4+阅读 · 2018年4月13日

Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems

Arxiv

3+阅读 · 2018年4月10日

Deep learning and its application to medical image segmentation

Arxiv

6+阅读 · 2018年3月23日

微信扫码咨询专知VIP会员