Automatic Differentiation (AD) is instrumental for science and industry. It is a set of techniques to evaluate the derivative of a function specified by a computer program. AD application domains span from machine learning to robotics to high energy physics. Gradients computed with AD are guaranteed to be more precise than their numerical alternatives, and require only a small, constant factor more arithmetic operations than the original function. Moreover, AD applications to domain problems are typically compute-bound: they are often limited by the computational demands of high-dimensional parameter spaces and thus can benefit from parallel implementations on graphics processing units (GPUs). Clad is a compiler-assisted AD tool that aims to enable differentiation for C/C++ and CUDA, and is available both as a compiler extension and in ROOT. It works as a plugin extending the Clang compiler, as a plugin extending the interactive interpreter Cling, and as a Jupyter kernel extension based on xeus-cling. We demonstrate the advantages of parallel gradient computations on GPUs with Clad, and explain how extending Clad to support CUDA brings forth a new layer of optimization and a proportional speedup. The gradients of well-behaved C++ functions can be executed automatically on a GPU. The library can be easily integrated into existing frameworks or used interactively. Furthermore, we demonstrate the achieved application performance improvements, including a roughly 10x speedup in ROOT histogram fitting, and the corresponding performance gains from offloading to GPUs.