A fundamental principle of learning theory is that there is a trade-off between the complexity of a prediction rule and its ability to generalize. Modern machine learning models do not obey this paradigm: They produce accurate predictions even when perfectly fitting the training set. We investigate over-parameterized linear regression models, focusing on the minimum norm solution: the solution with the minimal norm that attains a perfect fit to the training set. We utilize the recently proposed predictive normalized maximum likelihood (pNML) learner, which is the min-max regret solution for the distribution-free setting. We derive an upper bound on this min-max regret, which is associated with the prediction uncertainty. We show that if the test sample lies mostly in a subspace spanned by the eigenvectors associated with the large eigenvalues of the empirical correlation matrix of the training data, the model generalizes despite its over-parameterized nature. We demonstrate the use of the pNML regret as a point-wise learnability measure on synthetic data and successfully observe the double-descent phenomenon of over-parameterized models on UCI datasets.
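To make the setup concrete, the following is a minimal sketch (not the paper's code) of the two quantities the abstract refers to: the minimum norm interpolating solution in the over-parameterized regime, and the fraction of a test sample's energy that lies in the subspace spanned by the top eigenvectors of the empirical correlation matrix of the training data. The data, dimensions, and variable names are illustrative assumptions.

```python
import numpy as np

# Minimal sketch, assuming a random over-parameterized regression problem
# with fewer samples (n) than features (d).
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.normal(size=(n, d))   # training design matrix
y = rng.normal(size=n)        # training targets

# Minimum norm solution: the interpolator X @ theta = y with the smallest
# L2 norm, given by the Moore-Penrose pseudoinverse.
theta_mn = np.linalg.pinv(X) @ y
assert np.allclose(X @ theta_mn, y)  # perfect fit to the training set

# Empirical correlation matrix of the training data and its eigen-decomposition.
C = X.T @ X / n
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
top = eigvecs[:, -n:]                  # eigenvectors of the large (nonzero) eigenvalues

# Fraction of a test sample's energy lying in the top-eigenvector subspace;
# values close to 1 correspond to the regime where the abstract argues the
# min-norm predictor generalizes at that test point.
x_test = rng.normal(size=d)
energy_in_span = np.linalg.norm(top.T @ x_test) ** 2 / np.linalg.norm(x_test) ** 2
print(f"fraction of test sample in top-eigenvector subspace: {energy_in_span:.3f}")
```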