In comparison to classical shallow representation learning techniques, deep neural networks have achieved superior performance in nearly every application benchmark. But despite their clear empirical advantages, it is still not well understood what makes them so effective. To approach this question, we introduce deep frame approximation: a unifying framework for constrained representation learning with structured overcomplete frames. While exact inference requires iterative optimization, it may be approximated by the operations of a feed-forward deep neural network. We indirectly analyze how model capacity relates to frame structures induced by architectural hyperparameters such as depth, width, and skip connections. We quantify these structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability. As a criterion for model selection, we show correlation with generalization error on a variety of common deep network architectures and datasets. We also demonstrate how recurrent networks implementing iterative optimization algorithms can achieve performance comparable to their feed-forward approximations while improving adversarial robustness. This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design with less reliance on ad-hoc engineering.
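To make the two central ideas concrete, the sketch below illustrates (a) how a single iteration of a proximal-gradient (ISTA-style) solver for nonnegative sparse coding collapses to an ordinary ReLU layer, which is the sense in which a feed-forward network approximates exact iterative inference, and (b) a classical, per-dictionary frame potential, taken here as the squared Frobenius norm of the Gram matrix of l2-normalized columns. This is a minimal, hedged illustration only: the names `W`, `lam`, and `step` are hypothetical, and the full deep frame potential in the paper aggregates this kind of coherence measure over the structured frame induced by an entire architecture, not a single layer.

```python
import numpy as np

def frame_potential(W):
    """Classical frame potential of a dictionary W (d x k): squared
    Frobenius norm of the Gram matrix of its l2-normalized columns.
    Lower values indicate lower coherence (columns closer to orthogonal),
    which is linked to more unique and stable representations."""
    B = W / np.linalg.norm(W, axis=0, keepdims=True)
    G = B.T @ B
    return np.sum(G ** 2)

def ista_nonneg(x, W, lam, step, iters):
    """Proximal-gradient (ISTA-style) iterations for nonnegative sparse
    coding: min_{z >= 0} 0.5*||x - W z||^2 + lam*||z||_1.
    Exact inference runs this to convergence."""
    z = np.zeros(W.shape[1])
    for _ in range(iters):
        z = np.maximum(z + step * W.T @ (x - W @ z) - step * lam, 0.0)
    return z

def relu_layer(x, W, lam, step):
    """One ISTA iteration started from z = 0 reduces to a fully-connected
    ReLU layer with weights step*W^T and bias -step*lam: the feed-forward
    approximation of the iterative solver."""
    return np.maximum(step * (W.T @ x) - step * lam, 0.0)

# Illustrative check with random data (hypothetical sizes).
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 256))
x = rng.standard_normal(64)
print(np.allclose(ista_nonneg(x, W, lam=0.1, step=0.01, iters=1),
                  relu_layer(x, W, lam=0.1, step=0.01)))  # True
print(frame_potential(W))  # data-independent coherence score for W
```

Running the solver for more iterations (or unrolling it as a weight-tied recurrent network) corresponds to the recurrent models discussed above, while the one-step version corresponds to the standard feed-forward layer.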