Recent work showed that there can be a large gap between the classical uniform convergence bound and the actual test error of zero-training-error predictors (interpolators) such as deep neural networks. To better understand this gap, we study uniform convergence in the nonlinear random feature model and give a precise theoretical analysis of how uniform convergence depends on the sample size and the number of parameters. We derive and prove analytical expressions for three quantities in this model: 1) classical uniform convergence over norm balls, 2) uniform convergence over interpolators in the norm ball (recently proposed by Zhou et al. (2020)), and 3) the risk of the minimum norm interpolator. We show that, in the setting where the classical uniform convergence bound is vacuous (diverges to $\infty$), uniform convergence over the interpolators still yields a non-trivial bound on the test error of interpolating solutions. We also showcase a different setting where the classical uniform convergence bound is non-vacuous, but uniform convergence over interpolators gives an improved sample complexity guarantee. Our results provide the first exact comparison between the test errors and uniform convergence bounds for interpolators beyond simple linear models.
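For concreteness, the three quantities can be written schematically as follows; this is a sketch in generic notation rather than the paper's exact definitions, where $L$ denotes the population risk, $\hat{L}_n$ the empirical risk on $n$ samples, $a$ the trainable coefficients of the random feature model, and $B$ an assumed norm-ball radius:
\begin{align*}
U(B) &= \sup_{\|a\| \le B} \bigl| L(a) - \hat{L}_n(a) \bigr|
&& \text{(classical uniform convergence over the norm ball)} \\
T(B) &= \sup_{\|a\| \le B,\ \hat{L}_n(a) = 0} L(a)
&& \text{(uniform convergence over interpolators; Zhou et al., 2020)} \\
L(\hat{a}_{\min}), &\quad \hat{a}_{\min} = \operatorname*{arg\,min}_{a:\ \hat{L}_n(a) = 0} \|a\|
&& \text{(risk of the minimum norm interpolator)}
\end{align*}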