Automatic differentiation (AD) has been a topic of interest for researchers in many disciplines, and its popularity has grown since its application to machine learning and neural networks. Although many researchers appreciate AD and know how to apply it, truly understanding its underlying processes remains a challenge. From an algebraic point of view, however, AD appears surprisingly natural: it originates from the differentiation laws. In this work we use Algebra of Programming techniques to reason about different AD variants, leveraging Haskell to illustrate our observations. Our findings stem from three fundamental algebraic abstractions: (1) the notion of a module over a semiring, (2) Nagata's construction of the 'idealization of a module', and (3) Kronecker's delta function, which together allow us to write a single-line abstract definition of AD. From this single-line definition, and by instantiating our algebraic structures in various ways, we derive different AD variants that have the same extensional behaviour but different intensional properties, chiefly in terms of (asymptotic) computational complexity. We prove the different variants equivalent by means of Kronecker isomorphisms, a further elaboration of our Haskell infrastructure that guarantees correctness by construction. With this framework in place, this paper seeks to make AD variants more comprehensible by taking an algebraic perspective on the matter.
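To give a flavour of the algebraic view sketched above, here is a minimal Haskell illustration (not the paper's actual framework) of Nagata's idealization in its simplest, one-dimensional instance: a primal value paired with a tangent, i.e. dual numbers, whose `Num` instance encodes the differentiation laws. The names `Nagata`, `N`, and `forwardAD` are illustrative assumptions, not identifiers from the paper.

```haskell
-- A sketch of Nagata's idealization in one dimension: a value paired
-- with a derivative component (dual numbers). Illustrative names only.
data Nagata = N { primal :: Double, tangent :: Double }
  deriving Show

-- The semiring operations induce the differentiation laws:
-- the sum rule for (+) and the product (Leibniz) rule for (*).
instance Num Nagata where
  N f df + N g dg = N (f + g) (df + dg)
  N f df * N g dg = N (f * g) (f * dg + df * g)
  negate (N f df) = N (negate f) (negate df)
  fromInteger n   = N (fromInteger n) 0   -- constants have zero tangent
  abs    (N f df) = N (abs f) (df * signum f)
  signum (N f _)  = N (signum f) 0

-- Forward-mode AD of any function written against Num: seed the input's
-- tangent with 1 (the one-variable case of Kronecker's delta) and read
-- off the tangent of the result.
forwardAD :: (Nagata -> Nagata) -> Double -> Double
forwardAD f x = tangent (f (N x 1))
```

For example, `forwardAD (\x -> x * x + 3 * x) 2` evaluates the derivative of x² + 3x at x = 2, yielding 7. Other AD variants then arise, as the abstract describes, by instantiating the module and semiring structures differently rather than by changing this definition.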