Neural SDEs combine many of the best qualities of both RNNs and SDEs, and as such are a natural choice for modelling many types of temporal dynamics. They offer memory efficiency, high-capacity function approximation, and strong priors on model space. Neural SDEs may be trained as VAEs or as GANs; in either case it is necessary to backpropagate through the SDE solve. In particular, this may be done by constructing a backwards-in-time SDE whose solution is the desired parameter gradients. However, this approach has previously suffered from severe speed and accuracy issues, due to high computational complexity, numerical errors in the SDE solve, and the cost of reconstructing Brownian motion. Here, we make several technical innovations to overcome these issues. First, we introduce the reversible Heun method: a new SDE solver that is algebraically reversible, which reduces numerical gradient errors to almost zero and improves several test metrics by substantial margins over the state of the art. Moreover, it requires half as many function evaluations as comparable solvers, giving up to a $1.98\times$ speedup. Next, we introduce the Brownian interval: a new and computationally efficient way of exactly sampling and reconstructing Brownian motion, in contrast to previous reconstruction techniques that are both approximate and relatively slow. This gives up to a $10.6\times$ speed improvement over previous techniques. Finally, when specifically training Neural SDEs as GANs (Kidger et al. 2021), we demonstrate how SDE-GANs may be trained through careful weight clipping and choice of activation function. This reduces computational cost (giving up to a $1.87\times$ speedup) and removes the truncation errors of the double adjoint required for gradient penalty, substantially improving several test metrics. Altogether, these techniques offer substantial improvements over the state of the art.
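To make "algebraically reversible" concrete, the following is a minimal scalar sketch in plain Python. The abstract does not spell out the update rule, so the Heun-style step with an auxiliary state `z_hat` used here is an assumption, and the drift, diffusion, and helper names (`drift`, `diffusion`, `forward_step`, `reverse_step`, `round_trip`) are hypothetical choices for illustration only. The point it shows is that, given the same Brownian increments, each step can be inverted in closed form, so the initial state is recovered from the terminal state up to floating-point round-off.

```python
import math
import random


def drift(t, z):
    # Hypothetical drift, chosen only for illustration.
    return -0.5 * z


def diffusion(t, z):
    # Hypothetical diffusion, chosen only for illustration.
    return 0.3 * math.cos(z)


def forward_step(n, dt, dw, state):
    """One assumed reversible Heun-style step from index n to n + 1."""
    z, z_hat, mu, sigma = state
    z_hat_next = 2.0 * z - z_hat + mu * dt + sigma * dw
    # One new drift/diffusion evaluation per step; the old pair is reused.
    mu_next = drift((n + 1) * dt, z_hat_next)
    sigma_next = diffusion((n + 1) * dt, z_hat_next)
    z_next = z + 0.5 * (mu + mu_next) * dt + 0.5 * (sigma + sigma_next) * dw
    return (z_next, z_hat_next, mu_next, sigma_next)


def reverse_step(n, dt, dw, state_next):
    """Closed-form inverse of forward_step: from index n + 1 back to n."""
    z_next, z_hat_next, mu_next, sigma_next = state_next
    z_hat = 2.0 * z_next - z_hat_next - mu_next * dt - sigma_next * dw
    mu = drift(n * dt, z_hat)
    sigma = diffusion(n * dt, z_hat)
    z = z_next - 0.5 * (mu + mu_next) * dt - 0.5 * (sigma + sigma_next) * dw
    return (z, z_hat, mu, sigma)


def round_trip(num_steps=100, dt=0.01, seed=0):
    """Solve forwards, then backwards, reusing the same Brownian increments."""
    rng = random.Random(seed)
    dws = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(num_steps)]
    z0 = 1.0
    initial = (z0, z0, drift(0.0, z0), diffusion(0.0, z0))
    state = initial
    for n in range(num_steps):
        state = forward_step(n, dt, dws[n], state)
    terminal = state
    for n in reversed(range(num_steps)):
        state = reverse_step(n, dt, dws[n], state)
    return initial, terminal, state


if __name__ == "__main__":
    initial, terminal, recovered = round_trip()
    print("max reconstruction error:",
          max(abs(a - b) for a, b in zip(initial, recovered)))
```

In this sketch the backward pass needs only the terminal state and the stored increments `dws`, not the intermediate states. Storing the increments in a list is a simplification made here; how the increments are sampled and reconstructed efficiently is a separate concern that this sketch does not address.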