We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression, both in the random and the fixed design settings. Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data. In the fixed design setting, we show that the error is governed by the global complexity of the entire class. In contrast, in random design, ERM may only adapt to simpler models if the local neighborhoods around the regression function are nearly as complex as the class itself, a somewhat counter-intuitive conclusion. We provide sharp lower bounds for performance of ERM for both Donsker and non-Donsker classes. We also discuss our results through the lens of recent studies on interpolation in overparameterized models.
翻译:在随机和固定设计设置中,我们研究了回归任务中经验风险最小化(ERM)程序的最低错误。我们的锐弱下限揭示了适应生成数据的模型简单化的可能性(或不可能性 ) 。 在固定设计设置中,我们显示错误受整个类别全球复杂性的制约。相比之下,在随机设计中,机构风险管理只有在回归功能周围的周边环境与该类别本身几乎一样复杂的情况下,才可能适应更简单的模型,这是一个有点反直觉的结论。我们为唐斯克和非唐斯克级的ERM运行提供了更大幅度的下限。我们还从最近关于过度分化模型的内插研究的角度讨论了我们的结果。