Gradient flow in the Gaussian covariate model: exact solution of learning curves and multiple descent structures

A recent line of work has shown remarkable behaviors of the generalization error curves in simple learning models. Even least-squares regression exhibits atypical features such as model-wise double descent, and further works have observed triple or multiple descents. Another important characteristic is the epoch-wise descent structure that emerges during training. Model-wise and epoch-wise descents have been derived analytically only in limited theoretical settings (such as the random feature model); otherwise the observations are experimental. In this work, we provide a full and unified analysis of the whole time evolution of the generalization curve, in the asymptotic large-dimensional regime and under gradient flow, within a wider theoretical setting stemming from a Gaussian covariate model. In particular, we cover most cases already observed separately in the literature, and also provide examples of multiple descent structures as a function of a model parameter or of time. Furthermore, we show that our theoretical predictions adequately match the learning curves obtained by gradient descent over realistic datasets. Technically, we compute averages of rational expressions involving random matrices using recent developments in random matrix theory based on "linear pencils". Another contribution, which is also of independent interest in random matrix theory, is a new derivation of related fixed-point equations (and an extension thereof) using Dyson Brownian motion.
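As a rough illustration of what a time-dependent learning curve under gradient flow looks like, the sketch below simulates the simplest case the abstract mentions, least-squares regression. It is a finite-size numerical experiment, not the paper's asymptotic linear-pencil analysis. It assumes isotropic Gaussian covariates, a zero initial condition, and the loss (1/2n)·||y − Xβ||². All function names and parameter values are hypothetical and chosen only for this example.

```python
import numpy as np


def gradient_flow_beta(X, y, t):
    """Closed-form gradient-flow iterate beta(t) for the loss
    (1 / 2n) * ||y - X beta||^2, started from beta(0) = 0 (assumed setup)."""
    n = X.shape[0]
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Along each singular direction the flow is a scalar linear ODE, which
    # gives the filter (1 - exp(-t s^2 / n)) / s; it tends to 1/s as t grows.
    positive = s > 1e-12
    safe_s = np.where(positive, s, 1.0)
    filt = np.where(positive, -np.expm1(-t * s**2 / n) / safe_s, 0.0)
    return Vt.T @ (filt * (U.T @ y))


def excess_risk(beta, beta_star):
    """Excess test risk when test covariates are x ~ N(0, I)."""
    return float(np.sum((beta - beta_star) ** 2))


def learning_curve(n=50, d=20, noise=0.5,
                   times=(0.0, 0.1, 1.0, 10.0, 100.0), seed=0):
    """Excess risk of beta(t) at the requested times for one random draw."""
    rng = np.random.default_rng(seed)
    beta_star = rng.normal(size=d) / np.sqrt(d)  # hypothetical teacher vector
    X = rng.normal(size=(n, d))
    y = X @ beta_star + noise * rng.normal(size=n)
    return [excess_risk(gradient_flow_beta(X, y, t), beta_star) for t in times]


if __name__ == "__main__":
    for t, risk in zip((0.0, 0.1, 1.0, 10.0, 100.0), learning_curve()):
        print(f"t = {t:6.1f}   excess risk = {risk:.4f}")
```

Sweeping `d` at fixed `n` for a large `t`, or sweeping `t` at fixed `d / n`, gives empirical model-wise and epoch-wise curves. As `t` grows, `gradient_flow_beta` approaches the minimum-norm least-squares solution.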

