We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
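To make the core idea concrete, here is a minimal sketch of a continuous-depth block in PyTorch, using the `torchdiffeq` package released alongside the paper (github.com/rtqichen/torchdiffeq). The `ODEFunc`/`ODEBlock` names, hidden width, tolerances, and the integration interval [0, 1] are illustrative choices, not the paper's exact architecture; any adaptive ODE solver with adjoint support would serve the same role.

```python
import torch
import torch.nn as nn
# odeint_adjoint backpropagates by solving an adjoint ODE backward in time,
# giving constant memory cost instead of storing solver intermediates.
from torchdiffeq import odeint_adjoint as odeint

class ODEFunc(nn.Module):
    """Neural network f(h, t; theta) parameterizing the hidden-state derivative dh/dt."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, h):
        return self.net(h)

class ODEBlock(nn.Module):
    """Continuous-depth 'layer': integrate dh/dt = f(h, t) from t=0 to t=1."""
    def __init__(self, func, rtol=1e-5, atol=1e-7):
        super().__init__()
        self.func = func
        # Solver tolerances explicitly trade numerical precision for speed.
        self.rtol, self.atol = rtol, atol

    def forward(self, h0):
        t = torch.tensor([0.0, 1.0])
        # Black-box adaptive solver; number of function evaluations adapts per input.
        h = odeint(self.func, h0, t, rtol=self.rtol, atol=self.atol)
        return h[-1]  # the state at t=1 is the block's output

block = ODEBlock(ODEFunc(dim=2))
h0 = torch.randn(16, 2, requires_grad=True)
out = block(h0)
out.sum().backward()  # gradients flow through the solver via the adjoint method
```

The key design point the abstract highlights is visible here: the model is specified by its derivative `ODEFunc`, not by a fixed stack of layers, and the solver, treated as a black box, decides how much computation each input needs.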