Where dual-numbers forward-mode automatic differentiation (AD) pairs each scalar value with its tangent value, dual-numbers /reverse-mode/ AD attempts to achieve reverse AD using a similarly simple idea: pairing each scalar value with a backpropagator function. Its correctness and efficiency on higher-order input languages have been analysed by Brunel, Mazza and Pagani, but this analysis used a custom operational semantics for which it is unclear whether it can be implemented efficiently. We take inspiration from their use of /linear factoring/ to optimise dual-numbers reverse-mode AD into an algorithm that has the correct complexity and enjoys an efficient implementation in a standard functional language with support for mutable arrays, such as Haskell. Aside from the linear factoring ingredient, our optimisation steps consist of well-known ideas from the functional programming community. We demonstrate the practical use of our technique by providing a performant implementation that differentiates most of Haskell98.
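To make the pairing idea concrete, here is a minimal sketch of both styles. It is written in Python purely for brevity and is not the implementation described above. The names `Fwd`, `Rev`, `vadd`, `one_hot` and `grad` are illustrative assumptions, not taken from the paper. The reverse-mode half is the naive formulation only: it omits linear factoring and the other optimisation steps, so it makes no attempt at the efficiency the abstract is concerned with.

```python
from dataclasses import dataclass
from typing import Callable, List


def vadd(u: List[float], v: List[float]) -> List[float]:
    """Element-wise sum of two gradient vectors of equal length."""
    return [a + b for a, b in zip(u, v)]


def one_hot(i: int, n: int, d: float) -> List[float]:
    """Length-n vector holding d at position i and 0.0 elsewhere."""
    return [d if j == i else 0.0 for j in range(n)]


@dataclass(frozen=True)
class Fwd:
    """Forward-mode dual number: a value paired with its tangent."""
    val: float
    tan: float

    def __add__(self, other: "Fwd") -> "Fwd":
        return Fwd(self.val + other.val, self.tan + other.tan)

    def __mul__(self, other: "Fwd") -> "Fwd":
        # Product rule applied to the tangents.
        return Fwd(self.val * other.val,
                   self.tan * other.val + self.val * other.tan)


@dataclass(frozen=True)
class Rev:
    """Reverse-mode dual number: a value paired with a backpropagator.

    The backpropagator maps a cotangent for this scalar to a gradient
    contribution with respect to all inputs (here a plain list).
    """
    val: float
    bp: Callable[[float], List[float]]

    def __add__(self, other: "Rev") -> "Rev":
        return Rev(self.val + other.val,
                   lambda d: vadd(self.bp(d), other.bp(d)))

    def __mul__(self, other: "Rev") -> "Rev":
        # Each operand receives the cotangent scaled by the other's value.
        return Rev(self.val * other.val,
                   lambda d: vadd(self.bp(d * other.val),
                                  other.bp(d * self.val)))


def grad(f, xs: List[float]) -> List[float]:
    """Gradient of f at xs: seed each input with a one-hot backpropagator,
    run f once, then call the result's backpropagator with cotangent 1.0."""
    n = len(xs)
    inputs = [Rev(x, lambda d, i=i: one_hot(i, n, d))
              for i, x in enumerate(xs)]
    return f(*inputs).bp(1.0)


def f(x, y):
    """Example function f(x, y) = x*y + x*x, usable with Fwd or Rev."""
    return x * y + x * x
```

With these definitions, `f(Fwd(3.0, 1.0), Fwd(2.0, 0.0)).tan` gives the derivative with respect to `x` only, whereas `grad(f, [3.0, 2.0])` returns both partial derivatives from a single evaluation of `f`. In this naive form every use of a variable calls its backpropagator again, so shared subterms are traversed repeatedly.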