Computing partial differential equation (PDE) operators via nested backpropagation is popular yet expensive, and this cost severely restricts their utility for scientific machine learning. Recent advances, such as the forward Laplacian and randomized Taylor mode automatic differentiation (AD), propose forward schemes to address this. We introduce an optimization technique for Taylor mode that 'collapses' derivatives by rewriting the computational graph, and demonstrate how to apply it to general linear PDE operators and randomized Taylor mode. The modifications simply require propagating a sum up the computational graph, which could, and arguably should, be done by a machine learning compiler without exposing complexity to users. We implement our collapsing procedure and evaluate it on popular PDE operators, confirming that it accelerates Taylor mode and outperforms nested backpropagation.
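To make the collapsing idea concrete, below is a minimal sketch, not the paper's implementation, contrasting three ways to compute the Laplacian of a small multilayer perceptron in JAX: nested backpropagation, per-direction Taylor mode via jax.experimental.jet, and a hand-rolled collapsed forward pass in the spirit of the forward Laplacian, which propagates the summed second-order term through the graph. The architecture, layer sizes, and all function names are illustrative assumptions, as is the reading of jet's series entries as raw derivatives d^k/dt^k.

import jax
import jax.numpy as jnp
from jax.experimental.jet import jet

def mlp(params, x):
    # Small tanh MLP; the last layer has width 1, so the sum just
    # extracts the scalar output (illustrative architecture).
    for W, b in params[:-1]:
        x = jnp.tanh(W @ x + b)
    W, b = params[-1]
    return jnp.sum(W @ x + b)

def laplacian_backprop(f, x):
    # Nested backpropagation: materialize the full Hessian, take its trace.
    return jnp.trace(jax.hessian(f)(x))

def laplacian_taylor(f, x):
    # Taylor mode: one second-order jet per coordinate direction e_i,
    # summed only at the very end. Assumes jet's derivative-coefficient
    # convention, so the second series term is d^2/dt^2 f(x + t e_i).
    total = 0.0
    for ei in jnp.eye(x.shape[0]):
        _, (_, y2) = jet(f, (x,), ((ei, jnp.zeros_like(x)),))
        total += y2
    return total

def laplacian_collapsed(params, x):
    # 'Collapsed' scheme: carry (value, Jacobian, summed second-order term)
    # through the graph, so the sum over input directions is propagated at
    # every node instead of being formed after d separate passes.
    J = jnp.eye(x.shape[0])   # rows: d input directions, columns: units
    lap = jnp.zeros_like(x)   # per-unit accumulated Laplacian
    for W, b in params[:-1]:
        z = W @ x + b
        Jz, lap_z = J @ W.T, lap @ W.T    # linear layer: no curvature added
        t = jnp.tanh(z)
        ds, dds = 1 - t**2, -2 * t * (1 - t**2)   # tanh', tanh''
        # elementwise chain rule:
        # Lap s(z) = s'(z) * Lap z + s''(z) * sum_i (dz/dx_i)^2
        lap = ds * lap_z + dds * jnp.sum(Jz**2, axis=0)
        J, x = ds * Jz, t
    W, _ = params[-1]
    return jnp.sum(lap @ W.T)

key = jax.random.PRNGKey(0)
sizes = [3, 8, 8, 1]   # illustrative layer widths
params = [
    (jax.random.normal(k, (m, n)) / jnp.sqrt(n), jnp.zeros(m))
    for k, n, m in zip(jax.random.split(key, len(sizes) - 1), sizes[:-1], sizes[1:])
]
x = jnp.array([0.3, -0.7, 1.1])
f = lambda x: mlp(params, x)
# All three agree up to float32 round-off.
print(laplacian_backprop(f, x), laplacian_taylor(f, x), laplacian_collapsed(params, x))

In the collapsed pass only a single width-sized vector of second-order information is carried alongside the Jacobian, rather than one full second-order Taylor series per input direction; automating this graph rewrite instead of hand-deriving it per operator is exactly the role the abstract assigns to a machine learning compiler.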