Where dual-numbers forward-mode automatic differentiation (AD) pairs each scalar value with its tangent derivative, dual-numbers /reverse-mode/ AD attempts to achieve reverse AD with a similarly simple idea: pairing each scalar value with a backpropagator function. Its correctness and efficiency on higher-order input languages have been analysed by Brunel, Mazza and Pagani, but their analysis relied on a custom operational semantics, and it is unclear whether that semantics can be implemented efficiently. We take inspiration from their use of /linear factoring/ to optimise dual-numbers reverse-mode AD into an algorithm that has the correct complexity and admits an efficient implementation in a standard functional language with resource-linear types, such as Haskell. Aside from the linear factoring ingredient, our optimisation steps consist of well-known ideas from the functional programming community. Furthermore, we observe a connection with classical imperative taping-based reverse AD, as well as with Kmett's 'ad' Haskell library, recently analysed by Krawiec et al. We demonstrate the practical use of our technique by providing a performant implementation that differentiates most of Haskell98.
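To make the basic idea concrete, the following is a minimal sketch of the naive, unoptimised form of dual-numbers reverse AD, restricted to functions of two scalar inputs. It is not the optimised algorithm described above: the names Dual, Grad, addG and grad2 are hypothetical and chosen for illustration only. Each scalar is paired with a backpropagator, here assumed to be a function from the cotangent of that scalar to its contribution to the gradient with respect to the two inputs.

```haskell
-- Illustrative sketch only: naive dual-numbers reverse AD for two inputs.
-- All names here (Grad, Dual, addG, grad2) are hypothetical.

-- Gradient contribution with respect to the two inputs.
type Grad = (Double, Double)

-- A scalar paired with its backpropagator: given the cotangent of this
-- scalar, the backpropagator returns the resulting gradient contribution.
data Dual = Dual { primal :: Double, backprop :: Double -> Grad }

-- Pointwise addition of gradient contributions.
addG :: Grad -> Grad -> Grad
addG (a, b) (c, d) = (a + c, b + d)

instance Num Dual where
  Dual x dx + Dual y dy = Dual (x + y) (\d -> dx d `addG` dy d)
  Dual x dx * Dual y dy = Dual (x * y) (\d -> dx (y * d) `addG` dy (x * d))
  negate (Dual x dx)    = Dual (negate x) (\d -> dx (negate d))
  abs (Dual x dx)       = Dual (abs x) (\d -> dx (signum x * d))
  signum (Dual x _)     = Dual (signum x) (const (0, 0))
  fromInteger n         = Dual (fromInteger n) (const (0, 0))

-- Run a two-argument function on dual numbers and return its value
-- together with its gradient. Each input starts with a backpropagator
-- that routes the cotangent to its own slot; the output's
-- backpropagator is then applied to the initial cotangent 1.
grad2 :: (Dual -> Dual -> Dual) -> Double -> Double -> (Double, Grad)
grad2 f x y =
  let Dual v bp = f (Dual x (\d -> (d, 0))) (Dual y (\d -> (0, d)))
  in (v, bp 1)
```

For example, grad2 (\a b -> a * b + a * a) 3 4 evaluates to (21.0, (10.0, 3.0)). Note that in this naive form the backpropagator of a shared intermediate value is invoked once per use, so the cost of the reverse pass can grow well beyond the cost of the original computation; this is the kind of inefficiency that an optimisation such as linear factoring is meant to address.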