Automatic differentiation (AD) is a set of techniques that systematically applies the chain rule to compute the gradients of functions without requiring human intervention. Although the fundamentals of this technology were established decades ago, it is experiencing a renaissance because it plays a key role in efficiently computing gradients for backpropagation in machine learning algorithms. AD is also crucial for many applications in scientific computing, particularly emerging techniques that integrate machine learning models within scientific simulations and numerical schemes. Existing AD frameworks suffer from four main limitations: limited programming-language support, the need for code modifications to make programs AD-compatible, poor performance on scientific computing codes, and a naive store-all strategy for the forward-pass data required for gradient computation. These limitations force domain scientists to compute gradients manually for large problems. This work presents DaCe AD, a general, efficient automatic differentiation engine that requires no code modifications. DaCe AD uses a novel ILP-based algorithm to optimize the trade-off between storing and recomputing forward-pass data, achieving maximum performance within a given memory constraint. We showcase the generality of our method by applying it to NPBench, a suite of HPC benchmarks with diverse scientific computing patterns, where we outperform JAX, a Python framework with state-of-the-art general AD capabilities, by more than 92 times on average without requiring any code changes.
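The following is a minimal sketch, not DaCe AD itself, of the store-versus-recompute trade-off that the ILP formulation optimizes. It uses JAX's standard `jax.grad` and `jax.checkpoint` APIs; the `stage` function, loop count, and array size are illustrative assumptions. Plain reverse-mode AD keeps every forward-pass intermediate (the store-all strategy), whereas checkpointing a stage discards its intermediates and recomputes them during the backward pass, trading extra compute for lower peak memory.

```python
import jax
import jax.numpy as jnp


def stage(x):
    # An intermediate-heavy computation; each iteration produces
    # values that plain reverse-mode AD would store for the backward pass.
    for _ in range(4):
        x = jnp.sin(x) * jnp.cos(x)
    return x


def loss_store_all(x):
    # Default reverse mode: intermediates of both stages are kept in memory.
    return jnp.sum(stage(stage(x)))


def loss_recompute(x):
    # Checkpointed variant: the inner stage's intermediates are discarded
    # after the forward pass and recomputed during the backward pass.
    return jnp.sum(jax.checkpoint(stage)(stage(x)))


x = jnp.ones((1024,))
g1 = jax.grad(loss_store_all)(x)
g2 = jax.grad(loss_recompute)(x)  # same gradient, lower peak memory
```

In this sketch the choice of which stage to checkpoint is made by hand; the abstract's claim is that DaCe AD instead selects such decisions automatically via an ILP, maximizing performance under a given memory budget.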