We consider linear regression problems with a varying number of random projections, where we provably exhibit a double descent curve for a fixed prediction problem, with a high-dimensional analysis based on random matrix theory. We first consider the ridge regression estimator and re-interpret earlier results using classical notions from non-parametric statistics, namely degrees of freedom, also known as effective dimensionality. In particular, we show that the random design performance of ridge regression with a specific regularization parameter matches the classical bias and variance expressions coming from the easier fixed design analysis but for another larger implicit regularization parameter. We then compute asymptotic equivalents of the generalization performance (in terms of bias and variance) of the minimum norm least-squares fit with random projections, providing simple expressions for the double descent phenomenon.
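The double descent behaviour described above can be reproduced numerically with a small simulation. The following is a minimal sketch, not the paper's experimental setup: it assumes isotropic Gaussian inputs, a unit-norm target vector, and a Gaussian projection matrix `S`, and every name and parameter value (`excess_risk`, `double_descent_curve`, `n=50`, `d=100`, `sigma=0.5`) is illustrative. It fits the minimum norm least-squares solution on the projected features `X @ S` and reports the median excess risk as the number of projections `m` varies.

```python
import numpy as np


def excess_risk(n, d, m, sigma, rng):
    """Excess risk of the min-norm least-squares fit on m random projections.

    Assumes isotropic inputs (covariance = identity), so the excess risk
    equals the squared Euclidean distance between estimate and target.
    """
    theta = rng.standard_normal(d)
    theta /= np.linalg.norm(theta)              # unit-norm target (assumption)
    X = rng.standard_normal((n, d))             # random design
    y = X @ theta + sigma * rng.standard_normal(n)

    S = rng.standard_normal((d, m)) / np.sqrt(m)  # Gaussian random projection
    # The pseudo-inverse gives ordinary least squares when m < n and the
    # minimum norm interpolator when m > n.
    eta = np.linalg.pinv(X @ S) @ y
    theta_hat = S @ eta                         # predictor mapped back to R^d
    return float(np.sum((theta_hat - theta) ** 2))


def double_descent_curve(n=50, d=100,
                         ms=(10, 25, 40, 50, 60, 100, 200, 500, 1000),
                         sigma=0.5, trials=20, seed=0):
    """Median excess risk over independent trials for each value of m."""
    rng = np.random.default_rng(seed)
    return {
        m: float(np.median([excess_risk(n, d, m, sigma, rng)
                            for _ in range(trials)]))
        for m in ms
    }


if __name__ == "__main__":
    for m, risk in double_descent_curve().items():
        print(f"m = {m:5d}   median excess risk = {risk:.3f}")
```

The median is used rather than the mean because the risk is heavy-tailed near the interpolation threshold `m = n`, where the projected design matrix is square and nearly singular. In this sketch the risk should peak around `m = n` and decrease again as `m` grows further.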