Automatic Differentiation of Binned Likelihoods With RooFit and Clad

RooFit is a toolkit for statistical modeling and fitting used by most experiments in particle physics. As data sets from next-generation experiments grow, the processing requirements of physics analyses become more computationally demanding, which calls for performance optimizations in RooFit. One way to speed up minimization and improve its stability is Automatic Differentiation (AD). With numerical differentiation, the cost of computing a gradient grows linearly with the number of parameters; reverse-mode AD avoids this scaling, which makes AD particularly appealing for statistical models with many parameters. In this paper, we report on one possible way to implement AD in RooFit. Our approach is to add a facility that automatically generates C++ code for a full RooFit model. Unlike the original RooFit model, the generated code is free of virtual function calls and other RooFit-specific overhead. This code is then passed to Clad to produce the gradient automatically. Clad is a source-transformation AD tool, implemented as a plugin to the Clang compiler, that automatically generates derivative code for input C++ functions. We show results demonstrating the improvements observed when applying this code generation strategy to HistFactory and other commonly used RooFit models. HistFactory is the subcomponent of RooFit that implements binned likelihood models with probability densities based on histogram templates. These models frequently have a very large number of free parameters and are thus an interesting first target for AD support in RooFit.
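To make the argument about gradient cost concrete, the following C++ sketch compares two ways of obtaining the gradient of a toy binned Poisson likelihood. It is an illustration only. `BinnedModel`, its members, and the parametrization (one signal strength `mu` plus one background scale factor per bin) are hypothetical and are not taken from RooFit or HistFactory. The one-pass gradient is written by hand as a stand-in for the kind of derivative code a source-transformation tool emits; it is not actual Clad output.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical toy model (not RooFit/HistFactory code).
// Expected count in bin i: nu_i = mu * sig[i] + gamma_i * bkg[i].
// Parameter vector p = {mu, gamma_0, ..., gamma_{n-1}}.
struct BinnedModel {
    std::vector<double> sig;  // signal template
    std::vector<double> bkg;  // background template
    std::vector<double> obs;  // observed counts
    mutable std::size_t nllCalls = 0;  // counts likelihood evaluations

    // Poisson negative log-likelihood without the constant log(n!) terms.
    double nll(const std::vector<double>& p) const {
        assert(p.size() == obs.size() + 1);
        ++nllCalls;
        double sum = 0.0;
        for (std::size_t i = 0; i < obs.size(); ++i) {
            const double nu = p[0] * sig[i] + p[i + 1] * bkg[i];
            sum += nu - obs[i] * std::log(nu);
        }
        return sum;
    }

    // Central differences: two nll() evaluations per parameter.
    std::vector<double> numericGradient(std::vector<double> p,
                                        double h = 1e-6) const {
        std::vector<double> g(p.size());
        for (std::size_t k = 0; k < p.size(); ++k) {
            const double saved = p[k];
            p[k] = saved + h;
            const double up = nll(p);
            p[k] = saved - h;
            const double down = nll(p);
            p[k] = saved;
            g[k] = (up - down) / (2.0 * h);
        }
        return g;
    }

    // Hand-written gradient: a single pass over the bins,
    // independent of how many parameters there are.
    std::vector<double> analyticGradient(const std::vector<double>& p) const {
        assert(p.size() == obs.size() + 1);
        std::vector<double> g(p.size(), 0.0);
        for (std::size_t i = 0; i < obs.size(); ++i) {
            const double nu = p[0] * sig[i] + p[i + 1] * bkg[i];
            const double dNu = 1.0 - obs[i] / nu;  // d(nll)/d(nu_i)
            g[0] += dNu * sig[i];                  // d(nu_i)/d(mu) = sig[i]
            g[i + 1] = dNu * bkg[i];               // d(nu_i)/d(gamma_i) = bkg[i]
        }
        return g;
    }
};
```

For a model with n bins, `numericGradient` calls `nll` 2(n + 1) times, whereas `analyticGradient` makes one pass over the bins; the `nllCalls` counter makes the difference visible.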

Automatic differentiation

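In mathematics and computer algebra, automatic differentiation, sometimes called algorithmic differentiation, is a method for computing the derivative of a function by means of a computer program. The two traditional ways of differentiating are (1) symbolically differentiating an expression for the function and evaluating the result at a given point, and (2) using finite differences. The main drawbacks of symbolic differentiation are that it is slow and that converting a computer program into a single expression is difficult. In addition, many functions become complicated once higher-order derivatives are needed. The two important drawbacks of finite differences are round-off error introduced by the discretization and cancellation error. With both traditional methods, complexity and error grow when higher-order derivatives are computed. Automatic differentiation addresses these problems.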
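The contrast between finite differences and automatic differentiation can be sketched in a few lines of C++. The example below is an illustration under stated assumptions: it uses operator overloading on a hypothetical `Dual` type (forward mode), which is one common way to implement AD and is not the same technique as source transformation, and the test function `f` is arbitrary.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical dual number: carries a value and its derivative together.
struct Dual {
    double val;  // f(x)
    double der;  // f'(x)
};

// Each overload applies the matching differentiation rule.
inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
inline Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};  // product rule
}
inline Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }
inline Dual exp(Dual a) {
    const double e = std::exp(a.val);
    return {e, e * a.der};  // chain rule
}

// Arbitrary test function, written once for both double and Dual.
template <typename T>
T f(T x) {
    using std::exp;
    using std::sin;
    return x * sin(x) + exp(x * x);
}

// Forward-mode AD: seed the input derivative with 1 and read off f'(x).
double derivativeAD(double x) { return f(Dual{x, 1.0}).der; }

// Forward finite difference: the result depends on the step size h.
double derivativeFD(double x, double h) {
    assert(h > 0.0);
    return (f(x + h) - f(x)) / h;
}

// Closed form for comparison: sin(x) + x cos(x) + 2x exp(x^2).
double derivativeExact(double x) {
    return std::sin(x) + x * std::cos(x) + 2.0 * x * std::exp(x * x);
}
```

Calling `derivativeFD` with a large step exposes the discretization error, and calling it with a step close to machine precision exposes cancellation, while `derivativeAD` involves no step size at all and agrees with `derivativeExact` up to floating-point rounding.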
