Continuous deep learning architectures enable learning of flexible probabilistic models for predictive modeling as neural ordinary differential equations (ODEs), and for generative modeling as continuous normalizing flows. In this work, we design a framework to decipher the internal dynamics of these continuous-depth models by pruning their network architectures. Our empirical results suggest that pruning improves generalization for neural ODEs in generative modeling. We empirically show that this improvement arises because pruning helps avoid mode collapse and flattens the loss surface. Moreover, pruning finds efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy. We hope our results will invigorate further research into the performance-size trade-offs of modern continuous-depth models.
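As a rough illustration of the setting (not the paper's implementation), the sketch below applies PyTorch's built-in magnitude pruning to the vector field network of a toy neural ODE. The `VectorField` module, the fixed-step `euler_integrate` helper (standing in for an adaptive ODE solver), and the 90% sparsity level are all hypothetical choices made for this example.

```python
# Hypothetical sketch: magnitude pruning of the vector field inside a neural ODE.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

class VectorField(nn.Module):
    """MLP f(t, x) parameterizing dx/dt in a neural ODE."""
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, t, x):
        # Concatenate time as an extra input feature.
        t_col = t.expand(x.shape[0], 1)
        return self.net(torch.cat([x, t_col], dim=1))

def euler_integrate(f, x0, t0=0.0, t1=1.0, steps=20):
    """Fixed-step Euler solver, used here instead of an adaptive ODE solver."""
    x, dt = x0, (t1 - t0) / steps
    for i in range(steps):
        t = torch.tensor(t0 + i * dt)
        x = x + dt * f(t, x)
    return x

f = VectorField()
# Remove 90% of the weights in each linear layer by L1 magnitude.
for module in f.net:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)

x0 = torch.randn(8, 2)
x1 = euler_integrate(f, x0)  # forward pass through the pruned neural ODE
print(x1.shape)
```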