Continuous deep learning architectures enable the learning of flexible probabilistic models: as neural ordinary differential equations (ODEs) for predictive modeling, and as continuous normalizing flows for generative modeling. In this work, we design a framework to decipher the internal dynamics of these continuous-depth models by pruning their network architectures. Our empirical results suggest that pruning improves generalization for neural ODEs in generative modeling. Moreover, pruning finds minimal and efficient neural ODE representations with up to 98\% fewer parameters than the original network, without loss of accuracy. Finally, we show that applying pruning yields insights into the design of better neural ODEs. We hope our results will invigorate further research into the performance-size trade-offs of modern continuous-depth models.
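As an illustration of the kind of pruning applied to a neural ODE's dynamics network, the sketch below uses standard magnitude-based (L1) unstructured pruning in PyTorch. This is not the authors' exact procedure; the layer sizes, pruning ratio, and module names are assumptions chosen for demonstration only.

```python
# Illustrative sketch (assumed setup, not the paper's code): magnitude-based
# pruning of the dynamics network f(t, y) that parameterizes a neural ODE.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

class ODEFunc(nn.Module):
    """Dynamics f(t, y) defining dy/dt for a neural ODE (dimensions are assumed)."""
    def __init__(self, dim: int = 2, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim)
        )

    def forward(self, t, y):
        return self.net(y)

func = ODEFunc()

# Remove 90% of the smallest-magnitude weights in each linear layer
# (the 90% ratio is an arbitrary example value).
for module in func.net:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)
        prune.remove(module, "weight")  # make the induced sparsity permanent

# Report the fraction of zeroed parameters across the dynamics network.
total = sum(p.numel() for p in func.parameters())
zeros = sum((p == 0).sum().item() for p in func.parameters())
print(f"sparsity: {zeros / total:.1%}")
```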