Research on Heterogeneous Computing Algorithms for Large-Scale Plane-Wave Pseudopotential Density Functional Theory

December 31, 2012

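National Natural Science Fund

National Natural Science Foundation of China (NSFC)

Project title: Research on Heterogeneous Computing Algorithms for Large-Scale Plane-Wave Pseudopotential Density Functional Theory

Project number: No. 61202054

Project type: Young Scientists Fund

Year approved: 2013

Discipline: Computer Science

Principal investigator: Wang Long (王龙)

Institution: Computer Network Information Center, Chinese Academy of Sciences

Funding: CNY 230,000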

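Chinese abstract (translated): Plane-wave pseudopotential density functional calculation is the most widely used class of methods in materials science simulation, and the software implementing it plays a pivotal role in the field. In earlier work we produced an initial multi-GPU accelerated code based on this method, SC_PEtot. To our knowledge, it is the first plane-wave pseudopotential density functional code in the world that scales to multiple GPUs; it is about 10 times faster than the CPU version and 5 times faster than VASP, the most popular commercial code in the field. In this proposal, we aim to double the speed of SC_PEtot for typical systems of 512-1000 atoms, reaching a speedup of about 20 times. For materials science simulation on GPU clusters, this would set a new record far above the speedups reported by comparable studies. We will pursue this goal by: 1) moving the computation entirely onto the GPU; 2) adopting new algorithms that reduce MPI communication; and 3) adopting new CPU/GPU numerical libraries. We will also build a quantitative performance model that predicts computation time for different physical systems and computing resources, which will help identify the bottlenecks in heterogeneous computing. We will further try to extend this work to heterogeneous systems with different host/coprocessor ratios.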

Chinese keywords (translated): GPU; density functional theory; massively parallel

English abstract: Plane-wave pseudopotential (PWP) density functional theory (DFT) calculation is the most widely used method in materials science simulation, and DFT-PWP codes are arguably the most important materials science codes. We have implemented a DFT-PWP code, SC_PEtot, on a multi-node GPU machine. As far as we know, this is the first such code that scales to a large number of CPU/GPU computing units; the GPU version achieves a ~10 times speedup over the CPU version and is ~5 times faster than the widely used VASP code. In this project, we aim to achieve a ~2 times speedup over the old GPU code for a typical system of 512-1000 atoms. Such a speedup is much higher than that reported by similar work on this important class of materials simulation codes on GPU clusters. We plan to move the calculation fully onto the GPU, adopt a new algorithm to reduce MPI communication, and use new GPU and CPU numerical libraries. We also want to provide a detailed performance analysis: a quantitative model of the computation time for different physical systems and numbers of GPU units. Such a model can be used to understand the challenges and bottlenecks of DFT-PWP simulations on heterogeneous machines. We will also extend the heterogeneous computing algorithms to heterogeneous systems with different CPU/coprocessor configurations.

English keywords: GPU; DFT; massively parallel
