Dimensionality reduction, regularization, and generalization in overparameterized regressions

Overparameterization in deep learning is powerful: very large models fit the training data perfectly and yet generalize well. This realization revived the study of linear models for regression, including ordinary least squares (OLS), which, like deep learning, exhibits "double descent" behavior. Double descent has two features: (1) the risk (out-of-sample prediction error) can grow arbitrarily large as the number of samples $n$ approaches the number of parameters $p$, and (2) the risk decreases with $p$ for $p>n$, sometimes reaching a lower value than the lowest risk for $p<n$. The divergence of the OLS risk at $p\approx n$ is related to the condition number of the empirical covariance of the features, so it can be avoided with regularization. In this work we show that it can also be avoided with PCA-based dimensionality reduction, and we provide a finite upper bound for the risk of the PCA-based estimator. This result contrasts with recent work showing that a different form of dimensionality reduction -- one based on the population covariance instead of the empirical covariance -- does not avoid the divergence. We connect these results to an analysis of adversarial attacks, which become more effective as they raise the condition number of the empirical covariance of the features. We show that OLS is arbitrarily susceptible to data-poisoning attacks in the overparameterized regime -- unlike in the underparameterized regime -- and that regularization and dimensionality reduction improve robustness.
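The behavior described above can be illustrated with a small simulation. The following is a minimal sketch, not the experimental setup of this work: it assumes isotropic Gaussian features, a random true coefficient vector, and illustrative values for the noise level `sigma`, the ridge penalty `lam`, and the number of retained principal components `k`. The function names (`fit_ols`, `fit_ridge`, `fit_pcr`, `simulate`) are hypothetical, and `fit_pcr` is standard principal component regression on the empirical covariance, which may differ in detail from the PCA-based estimator analyzed here.

```python
import numpy as np


def fit_ols(X, y):
    # Minimum-norm least squares; coincides with OLS when p <= n and X has full rank.
    return np.linalg.pinv(X) @ y


def fit_ridge(X, y, lam):
    # Ridge regression: (X^T X + lam I)^{-1} X^T y.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def fit_pcr(X, y, k):
    # Principal component regression: regress on the top-k right singular
    # vectors of X, i.e. the leading eigenvectors of the empirical covariance.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(s))
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])


def excess_risk(beta_hat, beta):
    # For isotropic features (assumed here), E[(x^T (beta_hat - beta))^2]
    # equals the squared Euclidean distance between the coefficient vectors.
    return float(np.sum((beta_hat - beta) ** 2))


def simulate(n=50, p=50, sigma=0.5, lam=1.0, k=10, trials=200, seed=0):
    # Median excess risk of each estimator over independent training sets.
    rng = np.random.default_rng(seed)
    beta = rng.normal(size=p) / np.sqrt(p)
    risks = {"ols": [], "ridge": [], "pcr": []}
    for _ in range(trials):
        X = rng.normal(size=(n, p))
        y = X @ beta + sigma * rng.normal(size=n)
        risks["ols"].append(excess_risk(fit_ols(X, y), beta))
        risks["ridge"].append(excess_risk(fit_ridge(X, y, lam), beta))
        risks["pcr"].append(excess_risk(fit_pcr(X, y, k), beta))
    return {name: float(np.median(vals)) for name, vals in risks.items()}


if __name__ == "__main__":
    # Sweep the number of parameters p through the interpolation threshold p = n.
    for p in (10, 25, 50, 100, 200):
        print(p, simulate(n=50, p=p))
```

Under these assumptions, the OLS risk should peak near $p = n$, where the smallest singular values of the design matrix approach zero, while the ridge and truncated-SVD estimators stay bounded because they damp or discard those directions. The median is reported instead of the mean because the OLS risk is heavy-tailed near the threshold.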
