Neural SDEs combine many of the best qualities of both RNNs and SDEs: memory-efficient training, high-capacity function approximation, and strong priors on model space. This makes them a natural choice for modelling many types of temporal dynamics. Training a Neural SDE (either as a VAE or as a GAN) requires backpropagating through an SDE solve. This may be done by solving a backwards-in-time SDE whose solution is the desired parameter gradients. However, this approach has previously suffered from severe speed and accuracy issues, due to high computational cost and numerical truncation errors. Here, we overcome these issues through several technical innovations. First, we introduce the \textit{reversible Heun method}. This is a new SDE solver that is \textit{algebraically reversible}, which eliminates numerical gradient errors; it is the first such solver of which we are aware. Moreover, it requires half as many function evaluations as comparable solvers, giving up to a $1.98\times$ speedup. Second, we introduce the \textit{Brownian Interval}: a new, fast, memory-efficient, and exact way of sampling \textit{and reconstructing} Brownian motion. With this we obtain up to a $10.6\times$ speed improvement over previous techniques, which in contrast are both approximate and relatively slow. Third, when specifically training Neural SDEs as GANs (Kidger et al. 2021), we demonstrate how SDE-GANs may be trained through careful weight clipping and choice of activation function. This reduces computational cost (giving up to a $1.87\times$ speedup) and removes the numerical truncation errors associated with gradient penalty. Altogether, we outperform the state-of-the-art by substantial margins, with respect to training speed, and with respect to classification, prediction, and MMD test metrics. We have contributed implementations of all of our techniques to the torchsde library to help facilitate their adoption.
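To make the notion of algebraic reversibility concrete, the following is a minimal sketch of a reversible Heun-style step for a scalar SDE. The update rule is an assumption: it is not stated in the text above and is written from our understanding of the method. The `drift` and `diffusion` functions, the step count, and all helper names are hypothetical and chosen only for illustration. The Brownian increments are simply stored in a list, a simple stand-in that is not the Brownian Interval.

```python
import math
import random


def drift(t, z):
    # Hypothetical drift (assumption): mean reversion towards zero.
    return -0.5 * z


def diffusion(t, z):
    # Hypothetical state-dependent diffusion (assumption).
    return 0.3 * math.cos(z)


def forward_step(t, dt, dw, state):
    """Advance (y, z, mu, sigma) from time t to t + dt.

    Only one new drift/diffusion evaluation is made per step.
    """
    y, z, mu, sigma = state
    z_next = 2.0 * y - z + mu * dt + sigma * dw
    mu_next = drift(t + dt, z_next)
    sigma_next = diffusion(t + dt, z_next)
    y_next = y + 0.5 * (mu + mu_next) * dt + 0.5 * (sigma + sigma_next) * dw
    return (y_next, z_next, mu_next, sigma_next)


def backward_step(t, dt, dw, state_next):
    """Recover the state at time t from the state at t + dt in closed form."""
    y1, z1, mu1, sigma1 = state_next
    z = 2.0 * y1 - z1 - mu1 * dt - sigma1 * dw
    mu = drift(t, z)
    sigma = diffusion(t, z)
    y = y1 - 0.5 * (mu + mu1) * dt - 0.5 * (sigma + sigma1) * dw
    return (y, z, mu, sigma)


def solve_and_reverse(y0=1.0, t1=1.0, n_steps=100, seed=0):
    """Solve forwards, then reconstruct the initial state by stepping backwards."""
    rng = random.Random(seed)
    dt = t1 / n_steps
    # Stored Brownian increments, reused on the backward pass.
    dws = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]
    state = (y0, y0, drift(0.0, y0), diffusion(0.0, y0))
    for n in range(n_steps):
        state = forward_step(n * dt, dt, dws[n], state)
    terminal = state
    for n in reversed(range(n_steps)):
        state = backward_step(n * dt, dt, dws[n], state)
    return terminal, state


if __name__ == "__main__":
    terminal, recovered = solve_and_reverse()
    print("terminal y:", terminal[0])
    print("recovered y0:", recovered[0])
```

In exact arithmetic the backward step inverts the forward step exactly, so in floating point the recovered initial state agrees with the true one up to rounding error. This is the property that allows the forward trajectory to be reconstructed during backpropagation instead of being approximated by a separate backwards-in-time solve.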