The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge longstanding dogma in the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex relative to the size of the training dataset, and consequently they perfectly fit (i.e., interpolate) the training data, which is usually noisy. Such interpolation of noisy data is traditionally associated with detrimental overfitting, and yet a wide range of interpolating models -- from simple linear models to deep neural networks -- have recently been observed to generalize extremely well on fresh test data. Indeed, the recently discovered double descent phenomenon has revealed that highly overparameterized models often improve in test performance over the best underparameterized model. Understanding learning in this overparameterized regime requires new theory and foundational empirical studies, even for the simplest case of the linear model. The underpinnings of this understanding have been laid in very recent analyses of overparameterized linear regression and related statistical learning tasks, which have yielded precise analytic characterizations of double descent. This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML), which explains these recent findings from a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.
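The double descent phenomenon described above can be reproduced numerically even in the simplest linear setting. The following is a minimal sketch, not taken from the paper: it fits minimum-norm least squares (via the pseudoinverse, which interpolates the training data whenever the number of features used is at least the number of samples) on synthetic Gaussian data, with all dimensions, noise levels, and trial counts being illustrative choices.

```python
import numpy as np

# Double-descent sketch: minimum-norm least squares on synthetic Gaussian data.
# All sizes and the noise level are illustrative assumptions, not from the paper.
n_train, n_test, d_max, noise, trials = 20, 500, 60, 0.1, 30

def avg_test_mse(p):
    """Average test MSE of the min-norm least-squares fit using the first p features."""
    total = 0.0
    for t in range(trials):
        r = np.random.default_rng(t)
        beta = r.normal(size=d_max)
        beta /= np.linalg.norm(beta)              # unit-norm ground-truth coefficients
        X_tr = r.normal(size=(n_train, d_max))
        X_te = r.normal(size=(n_test, d_max))
        y_tr = X_tr @ beta + noise * r.normal(size=n_train)
        y_te = X_te @ beta + noise * r.normal(size=n_test)
        # pinv gives the minimum-norm solution; it interpolates y_tr when p >= n_train.
        coef = np.linalg.pinv(X_tr[:, :p]) @ y_tr
        total += np.mean((X_te[:, :p] @ coef - y_te) ** 2)
    return total / trials

# Sweep the number of features p across the interpolation threshold p = n_train = 20.
errors = {p: avg_test_mse(p) for p in (5, 15, 20, 40, 60)}
```

The resulting test error rises as `p` approaches `n_train` (the classical overfitting peak at the interpolation threshold) and then descends again as `p` grows past it, so the most overparameterized model here outperforms the best underparameterized one, which is the double descent behavior the abstract refers to.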