This article develops a general theory for minimum norm interpolating estimators and regularized empirical risk minimizers (RERM) in linear models with additive, potentially adversarial errors. In particular, no conditions on the errors are imposed. A quantitative bound for the prediction error is given, relating it to the Rademacher complexity of the covariates, the norm of the minimum norm interpolator of the errors, and the size of the subdifferential around the true parameter. The general theory is illustrated for Gaussian features and several norms: the $\ell_1$, $\ell_2$, group Lasso, and nuclear norms. In the case of sparsity- or low-rank-inducing norms, minimum norm interpolators and RERM yield a prediction error of the order of the average noise level, provided that the overparameterization exceeds the number of samples by at least a logarithmic factor and that, in the case of RERM, the regularization parameter is small enough. Lower bounds showing near-optimality of the results complement the analysis.
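For concreteness, here is a sketch of the two estimators in standard notation; the linear model $Y = X\beta^{*} + \xi$ with covariate matrix $X \in \mathbb{R}^{n \times p}$, error vector $\xi \in \mathbb{R}^{n}$, and a generic norm $\|\cdot\|$ are conventions of this sketch, not fixed in the abstract:
\[
\hat{\beta} \in \mathop{\arg\min}_{\beta \in \mathbb{R}^{p}} \bigl\{ \|\beta\| \colon X\beta = Y \bigr\}
\quad\text{and}\quad
\hat{\beta}_{\lambda} \in \mathop{\arg\min}_{\beta \in \mathbb{R}^{p}} \Bigl( \tfrac{1}{n}\|Y - X\beta\|_2^{2} + \lambda \|\beta\| \Bigr),
\]
the minimum norm interpolator and the RERM with regularization parameter $\lambda > 0$, respectively. In the overparameterized regime $p > n$ considered here, the interpolation constraint $X\beta = Y$ is typically feasible, so $\hat{\beta}$ fits the data exactly.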