We analyse the minimal $\ell_2$-norm interpolator $\hat{\beta}$ in a general high-dimensional linear regression framework $\mathbb Y=\mathbb X\beta^*+\xi$, where $\mathbb X$ is a random $n\times p$ matrix with independent $\mathcal N(0,\Sigma)$ rows and no assumption is made on the noise vector $\xi\in \mathbb R^n$. We prove that, with high probability, the prediction loss of this estimator is bounded from above by $(\|\beta^*\|^2_2r_{cn}(\Sigma)\vee \|\xi\|_2^2)/n$, where $r_{k}(\Sigma)=\sum_{i\geq k}\lambda_i(\Sigma)$ denotes the tail sums of the eigenvalues of $\Sigma$. These bounds show a transition in the rates. For high signal-to-noise ratios, the rates $\|\beta^*\|^2_2r_{cn}(\Sigma)/n$ broadly improve on the existing ones. For low signal-to-noise ratios, we also provide a lower bound holding with large probability. Under assumptions on the spectrum of $\Sigma$, this lower bound is of order $\|\xi\|_2^2/n$, matching the upper bound. Consequently, in the large-noise regime, we are able to precisely track the prediction error with large probability. These results give new insight into when interpolation can be harmless in high dimensions.
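As a numerical illustration of the quantities above, the following minimal sketch simulates the model, computes the minimal $\ell_2$-norm interpolator via the pseudoinverse, and evaluates the two terms appearing in the upper bound. Several choices here are assumptions and not taken from the text: the diagonal $\Sigma$ with spectrum $\lambda_i = 1/i$, the Gaussian noise, the placeholder value `c = 1.0` for the unspecified constant $c$, and the reading of the prediction loss as $\|\Sigma^{1/2}(\hat\beta-\beta^*)\|_2^2$. The bound is stated only up to constants, so the sketch computes its scale without asserting any inequality.

```python
import numpy as np

def min_norm_interpolator(X, Y):
    # Minimum l2-norm solution of X b = Y; it interpolates when X has full row rank (p >= n).
    return np.linalg.pinv(X) @ Y

def tail_sum(eigs, k):
    # r_k(Sigma) = sum_{i >= k} lambda_i(Sigma), eigenvalues sorted in
    # non-increasing order and indexed from 1.
    eigs = np.sort(np.asarray(eigs, dtype=float))[::-1]
    return eigs[k - 1:].sum()

rng = np.random.default_rng(0)
n, p = 50, 2000

# Hypothetical spectrum of a diagonal Sigma (an assumption for illustration).
eigs = 1.0 / np.arange(1, p + 1)

beta_star = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p)) * np.sqrt(eigs)   # rows ~ N(0, diag(eigs))
xi = rng.normal(size=n)                       # the text makes no assumption on xi
Y = X @ beta_star + xi

beta_hat = min_norm_interpolator(X, Y)

# Prediction loss, assumed here to be ||Sigma^{1/2}(beta_hat - beta*)||_2^2.
diff = beta_hat - beta_star
pred_loss = np.sum(eigs * diff ** 2)

c = 1.0  # placeholder for the unspecified constant c in r_{cn}(Sigma)
signal_term = np.sum(beta_star ** 2) * tail_sum(eigs, int(c * n)) / n
noise_term = np.sum(xi ** 2) / n
bound_scale = max(signal_term, noise_term)

print(f"prediction loss: {pred_loss:.4f}")
print(f"signal term:     {signal_term:.4f}")
print(f"noise term:      {noise_term:.4f}")
```

Varying the scale of `xi` relative to `beta_star` in this sketch is one way to explore which of the two terms dominates, which is the transition in the rates described above.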