In comparison to classical shallow representation learning techniques, deep neural networks have achieved superior performance in nearly every application benchmark. But despite their clear empirical advantages, it is still not well understood what makes them so effective. To approach this question, we introduce deep frame approximation, a unifying framework for representation learning with structured overcomplete frames. While exact inference requires iterative optimization, it may be approximated by the operations of a feed-forward deep neural network. We then indirectly analyze how model capacity relates to the frame structure induced by architectural hyperparameters such as depth, width, and skip connections. We quantify these structural differences with the deep frame potential, a data-independent measure of coherence linked to representation uniqueness and stability. As a criterion for model selection, we show correlation with generalization error on a variety of common deep network architectures such as ResNets and DenseNets. We also demonstrate how recurrent networks implementing iterative optimization algorithms achieve performance comparable to their feed-forward approximations. This connection to the established theory of overcomplete representations suggests promising new directions for principled deep network architecture design with less reliance on ad-hoc engineering.
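To make the coherence idea concrete, the sketch below computes the classical frame potential of a single overcomplete dictionary: the squared Frobenius norm of its Gram matrix, which is minimized by tight frames and grows as frame vectors become more correlated. This is only the standard single-layer notion that the paper's deep frame potential builds on, not the paper's layer-wise formulation; the function name, matrix sizes, and use of NumPy here are illustrative assumptions.

```python
import numpy as np

def frame_potential(B):
    """Classical frame potential of a dictionary B whose columns are frame vectors.

    Columns are normalized to unit norm; the potential is the squared Frobenius
    norm of the Gram matrix, i.e. the sum of squared pairwise inner products.
    Its minimum over k unit-norm vectors in R^d is k^2 / d, attained by tight
    frames, so lower values indicate lower mutual coherence.
    """
    B = B / np.linalg.norm(B, axis=0, keepdims=True)  # unit-norm columns
    G = B.T @ B                                       # Gram matrix of inner products
    return np.sum(G ** 2)

# Illustrative example: a random overcomplete frame with d = 64, k = 256.
rng = np.random.default_rng(0)
B = rng.standard_normal((64, 256))
print(frame_potential(B))   # a random frame sits above the tight-frame bound
print(256 ** 2 / 64)        # lower bound (1024), attained only by tight frames
```

Lower potential corresponds to a better-conditioned, more nearly tight frame; the paper's data-independent criterion aggregates this kind of Gram-matrix energy over the structured frames induced by a network's architecture.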