Automatic differentiation (AD) is conventionally understood as a family of distinct algorithms, rooted in two "modes" -- forward and reverse -- which are typically presented (and implemented) separately. Can there be only one? Following up on the AD systems developed in the JAX and Dex projects, we formalize a decomposition of reverse-mode AD into (i) forward-mode AD followed by (ii) unzipping the linear and non-linear parts and then (iii) transposition of the linear part. To that end, we use the technology of linear types to formalize a notion of structurally linear functions, which are then also algebraically linear. Our main results are that forward-mode AD produces structurally linear functions, and that we can unzip and transpose any structurally linear function, conserving cost, size, and structural linearity. Composing these three transformations recovers reverse-mode AD. This decomposition also sheds light on checkpointing, which emerges naturally from a free choice in unzipping let expressions. As a corollary, checkpointing techniques are applicable to general-purpose partial evaluation, not just AD. We hope that our formalization will lead to a deeper understanding of automatic differentiation and that it will simplify implementations, by separating the concerns of differentiation proper from the concerns of gaining efficiency (namely, separating the derivative computation from the act of running it backward).
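The three-step decomposition described above can be sketched with JAX's public API. This is only an illustration, not the formal system of the paper: it assumes that `jax.linearize` stands in for steps (i) and (ii) together (forward-mode AD, then splitting off the non-linear part by partial evaluation), and that `jax.linear_transpose` stands in for step (iii). The function `f` and the helper `grad_via_decomposition` are hypothetical names chosen for this example.

```python
import jax
import jax.numpy as jnp


def f(x):
    # Hypothetical example function; any differentiable scalar function would do.
    return jnp.sin(x) * x ** 2


def grad_via_decomposition(f, x):
    # Steps (i) + (ii): forward-mode AD followed by partial evaluation.
    # jax.linearize runs the non-linear (primal) part now and returns the
    # remaining linear map from input tangents to output tangents.
    y, f_jvp = jax.linearize(f, x)

    # Step (iii): transpose the linear map, so that it maps output
    # cotangents back to input cotangents. `x` is used only for its
    # shape and dtype here.
    f_vjp = jax.linear_transpose(f_jvp, x)

    # Seed with a cotangent of 1 to obtain the gradient of a scalar output.
    (x_bar,) = f_vjp(jnp.ones_like(y))
    return y, x_bar


x = jnp.array(1.5)
y, g = grad_via_decomposition(f, x)

# Reference value from JAX's built-in reverse mode, for comparison.
g_ref = jax.grad(f)(x)
```

Under these assumptions, `g` and `g_ref` should agree up to floating-point error, which matches the claim that composing the three transformations recovers reverse-mode AD.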