The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge longstanding dogma in the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex relative to the size of the training dataset, and consequently they perfectly fit (i.e., interpolate) the training data, which is usually noisy. Such interpolation of noisy data is traditionally associated with detrimental overfitting, and yet a wide range of interpolating models -- from simple linear models to deep neural networks -- have recently been observed to generalize extremely well on fresh test data. Indeed, the recently discovered double descent phenomenon has revealed that highly overparameterized models often improve in test performance over the best underparameterized model. Understanding learning in this overparameterized regime requires new theory and foundational empirical studies, even for the simplest case of the linear model. The underpinnings of this understanding have been laid in very recent analyses of overparameterized linear regression and related statistical learning tasks, which have yielded precise analytic characterizations of double descent. This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML), which explains these recent findings from a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.
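The double descent phenomenon described above can be reproduced numerically even in the simplest linear setting. The following is a minimal sketch, not taken from the paper: it fits minimum-norm least squares (via the pseudoinverse, which interpolates the training data whenever the number of features used is at least the number of samples) on synthetic Gaussian data, with all dimensions, noise levels, and trial counts being illustrative choices.

```python
import numpy as np

# Double-descent sketch: minimum-norm least squares on synthetic Gaussian data.
# All sizes and the noise level are illustrative assumptions, not from the paper.
n_train, n_test, d_max, noise, trials = 20, 500, 60, 0.1, 30

def avg_test_mse(p):
    """Average test MSE of the min-norm least-squares fit using the first p features."""
    total = 0.0
    for t in range(trials):
        r = np.random.default_rng(t)
        beta = r.normal(size=d_max)
        beta /= np.linalg.norm(beta)              # unit-norm ground-truth coefficients
        X_tr = r.normal(size=(n_train, d_max))
        X_te = r.normal(size=(n_test, d_max))
        y_tr = X_tr @ beta + noise * r.normal(size=n_train)
        y_te = X_te @ beta + noise * r.normal(size=n_test)
        # pinv gives the minimum-norm solution; it interpolates y_tr when p >= n_train.
        coef = np.linalg.pinv(X_tr[:, :p]) @ y_tr
        total += np.mean((X_te[:, :p] @ coef - y_te) ** 2)
    return total / trials

# Sweep the number of features p across the interpolation threshold p = n_train = 20.
errors = {p: avg_test_mse(p) for p in (5, 15, 20, 40, 60)}
```

The resulting test error rises as `p` approaches `n_train` (the classical overfitting peak at the interpolation threshold) and then descends again as `p` grows past it, so the most overparameterized model here outperforms the best underparameterized one, which is the double descent behavior the abstract refers to.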