A common challenge in regression is that for many problems, the degrees of freedom required for a high-quality solution also allow for overfitting. Regularization is a class of strategies that seek to restrict the range of possible solutions so as to discourage overfitting while still enabling good solutions, and different regularization strategies impose different types of restrictions. In this paper, we present a multilevel regularization strategy that constructs and trains a hierarchy of neural networks, each of which has layers that are wider versions of the previous network's layers. We draw intuition and techniques from the field of Algebraic Multigrid (AMG), traditionally used for solving linear and nonlinear systems of equations, and specifically adapt the Full Approximation Scheme (FAS) for nonlinear systems of equations to the problem of deep learning. Training through V-cycles then encourages the neural networks to build a hierarchical understanding of the problem. We refer to this approach as \emph{multilevel-in-width} to distinguish it from prior multilevel works that hierarchically alter the depth of neural networks. The resulting approach is a highly flexible framework that can be applied to a variety of layer types, which we demonstrate with both fully-connected and convolutional layers. We experimentally show with PDE regression problems that our multilevel training approach is an effective regularizer, improving the generalization performance of the neural networks studied.
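To make the multilevel-in-width idea concrete, the following is a minimal sketch of one FAS-style V-cycle for a one-hidden-layer regression network. The transfer operators (`restrict_width` pair-averages hidden units to form a narrower network; `prolong_width` duplicates them back) and the plain gradient-descent "smoother" are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(W1, W2, x):
    # tanh hidden layer, linear output
    return np.tanh(x @ W1) @ W2

def loss_grad(W1, W2, x, y):
    # gradients of the mean-squared-error loss (1/2n) * sum r^2
    h = np.tanh(x @ W1)
    r = h @ W2 - y
    gW2 = h.T @ r / len(x)
    gW1 = x.T @ ((r @ W2.T) * (1.0 - h**2)) / len(x)
    return gW1, gW2

def smooth(W1, W2, x, y, lr=0.1, steps=20):
    # a few gradient-descent sweeps play the role of multigrid smoothing
    for _ in range(steps):
        g1, g2 = loss_grad(W1, W2, x, y)
        W1, W2 = W1 - lr * g1, W2 - lr * g2
    return W1, W2

def restrict_width(W1, W2):
    # wide (2m hidden units) -> narrow (m): average paired incoming
    # weights, sum paired outgoing weights
    return 0.5 * (W1[:, 0::2] + W1[:, 1::2]), W2[0::2] + W2[1::2]

def prolong_width(W1c, W2c):
    # narrow (m) -> wide (2m): duplicate hidden units, halving the
    # outgoing weights so the network function is roughly preserved
    return np.repeat(W1c, 2, axis=1), 0.5 * np.repeat(W2c, 2, axis=0)

# toy regression data (stand-in for a PDE regression target)
x = rng.normal(size=(64, 3))
y = np.sin(x @ rng.normal(size=(3, 1)))

W1 = rng.normal(scale=0.5, size=(3, 8))   # fine (wide) network weights
W2 = rng.normal(scale=0.5, size=(8, 1))
init_mse = float(np.mean((forward(W1, W2, x) - y) ** 2))

# one two-level V-cycle: pre-smooth, coarse correction, post-smooth
W1, W2 = smooth(W1, W2, x, y)                  # pre-smoothing on fine level
W1c, W2c = restrict_width(W1, W2)              # restrict to narrow network
W1c2, W2c2 = smooth(W1c, W2c, x, y)            # train on the coarse level
dW1, dW2 = prolong_width(W1c2 - W1c, W2c2 - W2c)
W1, W2 = W1 + dW1, W2 + dW2                    # apply coarse correction (FAS-style)
W1, W2 = smooth(W1, W2, x, y)                  # post-smoothing

mse = float(np.mean((forward(W1, W2, x) - y) ** 2))
```

In an actual multilevel hierarchy this two-level step would be applied recursively, with the coarse-level objective carrying an FAS tau correction; the sketch keeps only the restrict/train/prolong/correct skeleton.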