This paper explores the generalization loss of linear regression in variably parameterized families of models, in both the under-parameterized and over-parameterized regimes. We show that the generalization curve can have an arbitrary number of peaks and, moreover, that the locations of those peaks can be explicitly controlled. Our results highlight the fact that neither the classical U-shaped generalization curve nor the recently observed double-descent curve is an intrinsic property of the model family. Instead, their emergence is due to the interaction between the properties of the data and the inductive biases of the learning algorithm.
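The double-descent phenomenon referred to above can be reproduced in a toy setting. The following sketch (a hypothetical illustration, not the paper's construction) fits minimum-norm least squares on the first `p` of `d` features and estimates test error; the error typically peaks near the interpolation threshold `p = n_train` and descends again in the over-parameterized regime:

```python
import numpy as np

def min_norm_test_mse(n_train=20, n_test=200, d=60,
                      p_values=(5, 20, 55), n_trials=200, seed=0):
    """Estimate test MSE of minimum-norm least squares as the number of
    used features p varies (toy setup; all names here are illustrative).

    Data: y = x @ w_star + noise, with x of dimension d; the model only
    sees the first p features, so unused features act as extra noise."""
    rng = np.random.default_rng(seed)
    w_star = rng.normal(size=d) / np.sqrt(d)  # ground-truth coefficients
    errs = {p: 0.0 for p in p_values}
    for _ in range(n_trials):
        X = rng.normal(size=(n_train, d))
        y = X @ w_star + 0.1 * rng.normal(size=n_train)
        X_test = rng.normal(size=(n_test, d))
        y_test = X_test @ w_star
        for p in p_values:
            # Pseudoinverse gives the minimum-norm least-squares solution,
            # which interpolates the training data when p >= n_train.
            w_hat = np.linalg.pinv(X[:, :p]) @ y
            errs[p] += np.mean((X_test[:, :p] @ w_hat - y_test) ** 2) / n_trials
    return errs
```

With these settings, `errs[20]` (at the interpolation threshold, `p = n_train`) is far larger than both `errs[5]` (under-parameterized) and `errs[55]` (over-parameterized), giving the characteristic double-descent peak.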