实时分析地理分布的超级计算机实验科学工作量 (Toward Real-time Analysis of Experimental Science Workloads on Geographically Distributed Supercomputers) - 专知论文

会员服务 ·

0

Integration · TOOLS · 缩放 · 边 · API ·

2021 年 5 月 13 日

Toward Real-time Analysis of Experimental Science Workloads on Geographically Distributed Supercomputers

翻译：实时分析地理分布的超级计算机实验科学工作量

Michael Salim,Thomas Uram,J. Taylor Childers,Venkat Vishwanath,Michael E. Papka

Massive upgrades to science infrastructure are driving data velocities upwards while stimulating adoption of increasingly data-intensive analytics. While next-generation exascale supercomputers promise strong support for I/O-intensive workflows, HPC remains largely untapped by live experiments, because data transfers and disparate batch-queueing policies are prohibitive when faced with scarce instrument time. To bridge this divide, we introduce Balsam: a distributed orchestration platform enabling workflows at the edge to securely and efficiently trigger analytics tasks across a user-managed federation of HPC execution sites. We describe the architecture of the Balsam service, which provides a workflow management API, and distributed sites that provision resources and schedule scalable, fault-tolerant execution. We demonstrate Balsam in efficiently scaling real-time analytics from two DOE light sources simultaneously onto three supercomputers (Theta, Summit, and Cori), while maintaining low overheads for on-demand computing, and providing a Python library for seamless integration with existing ecosystems of data analysis tools.

翻译：科学基础设施的大规模升级正在推动数据速度的上升,同时刺激采用日益数据密集型分析方法。虽然下一代的先进超级计算机承诺大力支持I/O密集型工作流程,但HPC仍然基本上没有被现场实验利用,因为面对紧缺的仪器时间,数据传输和分散的批量排解政策令人望而却步。为了弥合这一鸿沟,我们引入了Balsam:一个分布式的管弦平台,使边缘的工作流程能够安全有效地触发由用户管理的HPC执行点联合会之间的分析任务。我们描述了提供工作流程管理API的Balsam服务结构,以及提供资源和可缩放、容错执行时间表的分布点。我们展示了Balsam在将两个DOE光源的实时分析器同时推广到三个超级计算机(Theta、Celead和Cori),同时保持低按需计算间接费用,并为数据分析工具与现有生态系统的无缝合而提供Python图书馆。

0

相关内容

Integration

Integration：Integration, the VLSI Journal。 Explanation：集成，VLSI杂志。 Publisher：Elsevier。 SIT：http://dblp.uni-trier.de/db/journals/integration/

【2020新书】Python文本分析，104页pdf

【2020新书】Python文本分析，104页pdf

专知会员服务

100+阅读 · 2020年12月23日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

经济学中的数据科学，Data Science in Economics，附22页pdf

经济学中的数据科学，Data Science in Economics，附22页pdf

专知会员服务

36+阅读 · 2020年4月1日

【综述】联邦学习的威胁，Threats to Federated Learning: A Survey

【综述】联邦学习的威胁，Threats to Federated Learning: A Survey

专知会员服务

80+阅读 · 2020年3月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

281+阅读 · 2019年10月9日

【AAMSA 2019 | tutorial】多智能体系统中的认知推理Epistemic Reasoning In Multiagent Systems ,法国雷恩François Schwarzentruber

【AAMSA 2019 | tutorial】多智能体系统中的认知推理Epistemic Reasoning In Multiagent Systems ,法国雷恩François Schwarzentruber

专知会员服务

24+阅读 · 2019年5月14日

计算机类 | PLDI 2020等国际会议信息6条

计算机类 | PLDI 2020等国际会议信息6条

Call4Papers

3+阅读 · 2019年7月8日

人工智能 | ACCV 2020等国际会议信息5条

人工智能 | ACCV 2020等国际会议信息5条

Call4Papers

6+阅读 · 2019年6月21日

计算机 | ICDE 2020等国际会议信息8条

计算机 | ICDE 2020等国际会议信息8条

Call4Papers

3+阅读 · 2019年5月24日

CCF C类 | DSAA 2019 诚邀稿件

CCF C类 | DSAA 2019 诚邀稿件

Call4Papers

6+阅读 · 2019年5月13日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

大数据 | 顶级SCI期刊专刊/国际会议信息7条

大数据 | 顶级SCI期刊专刊/国际会议信息7条

Call4Papers

10+阅读 · 2018年12月29日

计算机类 | ISCC 2019等国际会议信息9条

计算机类 | ISCC 2019等国际会议信息9条

Call4Papers

5+阅读 · 2018年12月25日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

计算机 | CCF推荐会议信息10条

计算机 | CCF推荐会议信息10条

Call4Papers

5+阅读 · 2018年10月18日

机器人开发库软件大列表

机器人开发库软件大列表

专知

10+阅读 · 2018年3月18日

Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture

Arxiv

0+阅读 · 2021年7月6日

Disparate Impact Diminishes Consumer Trust Even for Advantaged Users

Arxiv

0+阅读 · 2021年7月5日

Recent Advancements In Distributed System Communications

Arxiv

0+阅读 · 2021年7月3日

A scalable spectral Stokes solver for simulation of time-periodic flows in complex geometries

Arxiv

0+阅读 · 2021年7月2日

From Zero to Fog: Efficient Engineering of Fog-Based Internet of Things Applications

Arxiv

0+阅读 · 2021年7月2日

Privacy in Distributed Computations based on Real Number Secret Sharing

Arxiv

0+阅读 · 2021年7月2日

Distributed Machine Learning on Mobile Devices: A Survey

Distributed Machine Learning on Mobile Devices: A Survey

Arxiv

37+阅读 · 2019年9月18日

Graph Neural Networks: A Review of Methods and Applications

Graph Neural Networks: A Review of Methods and Applications

Arxiv

5+阅读 · 2019年7月10日

Semantics of Data Mining Services in Cloud Computing

Semantics of Data Mining Services in Cloud Computing

Arxiv

4+阅读 · 2018年10月5日

Big Data: Understanding Big Data

Arxiv

6+阅读 · 2016年1月15日

VIP会员

文章信息

相关主题

相关VIP内容

【2020新书】Python文本分析，104页pdf

【2020新书】Python文本分析，104页pdf

专知会员服务

100+阅读 · 2020年12月23日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

经济学中的数据科学，Data Science in Economics，附22页pdf

经济学中的数据科学，Data Science in Economics，附22页pdf

专知会员服务

36+阅读 · 2020年4月1日

【综述】联邦学习的威胁，Threats to Federated Learning: A Survey

【综述】联邦学习的威胁，Threats to Federated Learning: A Survey

专知会员服务

80+阅读 · 2020年3月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

281+阅读 · 2019年10月9日

【AAMSA 2019 | tutorial】多智能体系统中的认知推理Epistemic Reasoning In Multiagent Systems ,法国雷恩François Schwarzentruber

【AAMSA 2019 | tutorial】多智能体系统中的认知推理Epistemic Reasoning In Multiagent Systems ,法国雷恩François Schwarzentruber

专知会员服务

24+阅读 · 2019年5月14日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能赋能自主武器与人类控制第三部分：人类控制与系统操作员 | 35页

人工智能赋能自主武器与人类控制第一部分：人类控制与机器学习的设计和开发 | 46页

军事指挥控制系统：2025年5种用途

人工智能赋能自主武器与人类控制第二部分：人类控制与军事指挥官 | 38页

相关资讯

计算机类 | PLDI 2020等国际会议信息6条

计算机类 | PLDI 2020等国际会议信息6条

Call4Papers

3+阅读 · 2019年7月8日

人工智能 | ACCV 2020等国际会议信息5条

人工智能 | ACCV 2020等国际会议信息5条

Call4Papers

6+阅读 · 2019年6月21日

计算机 | ICDE 2020等国际会议信息8条

计算机 | ICDE 2020等国际会议信息8条

Call4Papers

3+阅读 · 2019年5月24日

CCF C类 | DSAA 2019 诚邀稿件

CCF C类 | DSAA 2019 诚邀稿件

Call4Papers

6+阅读 · 2019年5月13日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

大数据 | 顶级SCI期刊专刊/国际会议信息7条

大数据 | 顶级SCI期刊专刊/国际会议信息7条

Call4Papers

10+阅读 · 2018年12月29日

计算机类 | ISCC 2019等国际会议信息9条

计算机类 | ISCC 2019等国际会议信息9条

Call4Papers

5+阅读 · 2018年12月25日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

计算机 | CCF推荐会议信息10条

计算机 | CCF推荐会议信息10条

Call4Papers

5+阅读 · 2018年10月18日

机器人开发库软件大列表

机器人开发库软件大列表

专知

10+阅读 · 2018年3月18日

相关论文

Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture

Arxiv

0+阅读 · 2021年7月6日

Disparate Impact Diminishes Consumer Trust Even for Advantaged Users

Arxiv

0+阅读 · 2021年7月5日

Recent Advancements In Distributed System Communications

Arxiv

0+阅读 · 2021年7月3日

A scalable spectral Stokes solver for simulation of time-periodic flows in complex geometries

Arxiv

0+阅读 · 2021年7月2日

From Zero to Fog: Efficient Engineering of Fog-Based Internet of Things Applications

Arxiv

0+阅读 · 2021年7月2日

Privacy in Distributed Computations based on Real Number Secret Sharing

Arxiv

0+阅读 · 2021年7月2日

Distributed Machine Learning on Mobile Devices: A Survey

Distributed Machine Learning on Mobile Devices: A Survey

Arxiv

37+阅读 · 2019年9月18日

Graph Neural Networks: A Review of Methods and Applications

Graph Neural Networks: A Review of Methods and Applications

Arxiv

5+阅读 · 2019年7月10日

Semantics of Data Mining Services in Cloud Computing

Semantics of Data Mining Services in Cloud Computing

Arxiv

4+阅读 · 2018年10月5日

Big Data: Understanding Big Data

Arxiv

6+阅读 · 2016年1月15日

微信扫码咨询专知VIP会员