Recent successes of massively overparameterized models have inspired a new line of work investigating the conditions under which such models generalize well. This paper considers a framework in which the possibly overparameterized model includes fake features, i.e., features that are present in the model but not in the data. We present a non-asymptotic high-probability bound on the generalization error of ridge regression under this form of model misspecification. Our results characterize the interplay between the implicit regularization provided by the fake features and the explicit regularization provided by the ridge parameter. We observe that fake features may improve the generalization error, even though they are irrelevant to the data.