A neural network model of a differential equation, namely the neural ODE, has enabled the learning of continuous-time dynamical systems and probabilistic distributions with high accuracy. The neural ODE uses the same network repeatedly during a numerical integration, so the backpropagation algorithm requires a memory footprint proportional to the number of uses times the network size. This is true even if a checkpointing scheme divides the computational graph into sub-graphs. Alternatively, the adjoint method obtains the gradient by a numerical integration backward in time with a minimal memory footprint; however, it suffers from numerical errors. This study proposes the symplectic adjoint method, which obtains the exact gradient (up to rounding error) with a memory footprint proportional to the number of uses plus the network size. The experimental results demonstrate that the symplectic adjoint method occupies the smallest memory footprint in most cases, runs faster in some cases, and is more robust to rounding errors than competing methods.
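To make the trade-off concrete, the following is a minimal sketch of the two baselines the abstract contrasts, assuming PyTorch and the torchdiffeq library (the reference implementation of neural ODEs); the ODEFunc class, the dimensions, and the toy loss are illustrative, and the symplectic adjoint method proposed here is not part of torchdiffeq.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint, odeint_adjoint

class ODEFunc(nn.Module):
    """Vector field f(t, z); the same network is evaluated at every solver step."""
    def __init__(self, dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, z):
        return self.net(z)

func = ODEFunc()
z0 = torch.randn(16, 2, requires_grad=True)
t = torch.linspace(0.0, 1.0, 50)

# Backpropagation through the solver: exact gradient, but the computational
# graph stores activations for every network use, so memory grows as
# (number of uses) x (network size).
zT = odeint(func, z0, t)
zT[-1].pow(2).sum().backward()

# Adjoint method: solves an adjoint ODE backward in time, so memory stays
# near a single network use, but the backward integration introduces
# numerical error into the gradient.
func.zero_grad()
zT = odeint_adjoint(func, z0, t)
zT[-1].pow(2).sum().backward()
```

The symplectic adjoint method sits between these two: it also integrates backward in time, but a symplectic pairing of the forward and backward solvers recovers the exact gradient (up to rounding error) at a memory cost of the number of uses plus the network size.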