
March 21, 2021

FT-GCR: a fault-tolerant generalized conjugate residual elliptic solver


Mike Gillard, Tommaso Benacchio

With the steady advance of high performance computing systems featuring smaller and smaller hardware components, the systems and algorithms used for numerical simulations increasingly contend with disruptions caused by hardware failures and bit-level misrepresentations of computing data. In numerical frameworks exploiting massive processing power, the solution of linear systems often represents the most computationally intensive component. Given the large number of repeated operations involved, iterative solvers are particularly vulnerable to bit-flips. A new method named FT-GCR is proposed here that supplies the preconditioned Generalized Conjugate Residual Krylov solver with detection of, and recovery from, soft faults. The algorithm tests for the monotonic decrease of the residual norm and, upon failure, restarts the iteration within the local Krylov space. Numerical experiments on the solution of an elliptic problem arising from a stationary flow over an isolated hill on the sphere show the skill of the method in addressing bit-flips on a range of grid sizes and data loss scenarios, with best returns and detection rates obtained for larger corruption events. The simplicity of the method makes it easily extendable to other solvers and an ideal candidate for algorithmic fault tolerance within integrated model resilience strategies.
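The detection-and-restart idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes an unpreconditioned GCR, a 1-D tridiagonal operator standing in for the elliptic problem on the sphere, a single bit-flip injected into the candidate residual, and a restart that simply discards the stored search directions and recomputes the residual from the last accepted iterate. The names `apply_operator`, `flip_bit`, `ft_gcr_sketch`, `fault_at` and `fault_bit` are hypothetical.

```python
import math
import struct


def apply_operator(x):
    """Tridiagonal (-1, 2, -1) operator, an assumed stand-in for the elliptic problem."""
    n = len(x)
    return [
        2.0 * x[i]
        - (x[i - 1] if i > 0 else 0.0)
        - (x[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def flip_bit(value, bit):
    """Flip one bit of a 64-bit float to emulate a soft fault."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return flipped


def ft_gcr_sketch(b, tol=1e-10, max_iter=200, fault_at=None, fault_bit=62):
    """GCR with a residual-monotonicity check; returns (x, iterations, restarts)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                    # residual b - A x for the initial guess x = 0
    directions, images = [], []    # search directions p_j and their images A p_j
    res_prev = norm(r)
    restarts = 0
    for it in range(max_iter):
        if res_prev <= tol:
            return x, it, restarts
        # New direction: the residual, with A p made orthogonal to earlier A p_j.
        p, ap = list(r), apply_operator(r)
        for pj, apj in zip(directions, images):
            beta = dot(ap, apj) / dot(apj, apj)
            p = [pi - beta * pji for pi, pji in zip(p, pj)]
            ap = [ai - beta * aji for ai, aji in zip(ap, apj)]
        alpha = dot(r, ap) / dot(ap, ap)
        r_new = [ri - alpha * ai for ri, ai in zip(r, ap)]
        if it == fault_at:         # assumed fault model: one flipped bit in the residual
            r_new[n // 2] = flip_bit(r_new[n // 2], fault_bit)
        res = norm(r_new)
        if not (res <= res_prev):  # also true for NaN or infinite norms
            # Fault suspected: drop the candidate and the stored Krylov directions,
            # then recompute the residual from the last accepted iterate.
            r = [bi - ai for bi, ai in zip(b, apply_operator(x))]
            directions, images = [], []
            res_prev = norm(r)
            restarts += 1
            continue
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r, res_prev = r_new, res
        directions.append(p)
        images.append(ap)
    return x, max_iter, restarts


if __name__ == "__main__":
    rhs = [1.0] * 16
    print(ft_gcr_sketch(rhs)[1:])               # (iterations, restarts), clean run
    print(ft_gcr_sketch(rhs, fault_at=0)[1:])   # same solve with one injected bit-flip
```

In this sketch a flip that makes the residual norm grow triggers a restart, whereas a flip that happens to shrink the norm would pass the check unnoticed. That limitation of the toy check is in line with the abstract's remark that detection rates are best for larger corruption events.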

