Recent works have demonstrated that increasing model capacity through width in over-parameterized neural networks leads to a decrease in test risk. For neural networks, however, model capacity can also be increased through depth, yet understanding the impact of increasing depth on test risk remains an open question. In this work, we demonstrate that the test risk of over-parameterized convolutional networks follows a U-shaped curve (i.e., first monotonically decreasing, then increasing) as depth increases. We first provide empirical evidence for this phenomenon via image classification experiments using both ResNets and the convolutional neural tangent kernel (CNTK). We then present a novel linear regression framework for characterizing the impact of depth on test risk, and show that increasing depth leads to a U-shaped test risk for the linear CNTK. In particular, we prove that the linear CNTK corresponds to a depth-dependent linear transformation on the original space and characterize properties of this transformation. We then analyze over-parameterized linear regression under arbitrary linear transformations and, in simplified settings, provably identify the depths that minimize each of the bias and variance terms of the test risk.
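To make the linear regression framework concrete, the following is a minimal illustrative sketch (not the paper's exact construction): minimum-norm linear regression on features passed through a hypothetical depth-dependent linear map, showing how the test risk of the interpolating solution can vary non-monotonically as a "depth" parameter grows. The specific transformation `depth_transform` is an assumption chosen for illustration, not the map derived from the linear CNTK.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, n_test = 200, 50, 2000          # over-parameterized regime: p > n
beta_star = rng.normal(size=p) / np.sqrt(p)

X_train = rng.normal(size=(n, p))
X_test = rng.normal(size=(n_test, p))
y_train = X_train @ beta_star + 0.1 * rng.normal(size=n)
y_test_clean = X_test @ beta_star

def depth_transform(depth, p):
    # Hypothetical stand-in for a depth-dependent linear map: apply a fixed
    # local-averaging matrix `depth` times (deeper = more smoothing).
    A = 0.9 * np.eye(p) + 0.1 * np.roll(np.eye(p), 1, axis=1)
    return np.linalg.matrix_power(A, depth)

for depth in [1, 2, 4, 8, 16, 32]:
    M = depth_transform(depth, p)
    Phi_train, Phi_test = X_train @ M, X_test @ M
    # Minimum-norm interpolating solution in the transformed feature space.
    w = np.linalg.pinv(Phi_train) @ y_train
    risk = np.mean((Phi_test @ w - y_test_clean) ** 2)
    print(f"depth={depth:3d}  test risk={risk:.4f}")
```

Under this toy transformation the bias and variance of the min-norm estimator trade off as the map becomes increasingly ill-conditioned with depth, which is the qualitative mechanism the abstract refers to; the paper's analysis characterizes the actual CNTK-induced transformation.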