Classical regression has a simple geometric description in terms of a projection of the training labels onto the column space of the design matrix. However, for over-parameterized models -- where the number of fit parameters is large enough to perfectly fit the training data -- this picture becomes uninformative. Here, we present an alternative geometric interpretation of regression that applies to both under- and over-parameterized models. Unlike the classical picture, which takes place in the space of training labels, our new picture resides in the space of input features. This new feature-based perspective provides a natural geometric interpretation of the double-descent phenomenon in the context of bias and variance, explaining why it can occur even in the absence of label noise. Furthermore, we show that adversarial perturbations -- small perturbations to the input features that result in large changes in label values -- are a generic feature of biased models, arising from the underlying geometry. We demonstrate these ideas by analyzing three minimal models for over-parameterized linear least squares regression: without basis functions (input features equal model features) and with linear or nonlinear basis functions (two-layer neural networks with linear or nonlinear activation functions, respectively).
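As an illustration of the setting, the following is a minimal numerical sketch (not the paper's exact models) of double descent in least-squares regression with random nonlinear basis functions, i.e., a two-layer network with a frozen random first layer. The teacher function, feature map, and sample sizes here are illustrative assumptions; the labels are noiseless, consistent with the claim that double descent can occur without label noise.

```python
# A minimal sketch (illustrative assumptions, not the paper's exact setup):
# double descent in random-feature least-squares regression. Model features
# are a fixed random nonlinear map of the inputs (a two-layer network with a
# frozen first layer); only the second layer is fit.
import numpy as np

rng = np.random.default_rng(0)

def teacher(x):
    # Noiseless ground-truth labels: double descent appears even without label noise.
    return np.sin(2 * x).ravel()

n_train, n_test, d = 40, 500, 1
x_train = rng.uniform(-np.pi, np.pi, (n_train, d))
x_test = rng.uniform(-np.pi, np.pi, (n_test, d))
y_train, y_test = teacher(x_train), teacher(x_test)

for p in [5, 20, 40, 80, 400]:  # number of random features (fit parameters)
    W = rng.normal(size=(d, p))      # frozen random first layer
    phi = lambda x: np.tanh(x @ W)   # nonlinear basis functions
    # np.linalg.pinv gives the ordinary least-squares solution when
    # p < n_train and the minimum-norm interpolating solution when
    # p >= n_train, so one line covers both regimes.
    w = np.linalg.pinv(phi(x_train)) @ y_train
    train_err = np.mean((phi(x_train) @ w - y_train) ** 2)
    test_err = np.mean((phi(x_test) @ w - y_test) ** 2)
    print(f"p={p:4d}  train MSE={train_err:.2e}  test MSE={test_err:.3f}")
```

In a run of this sketch, the training error drops to numerical zero once p reaches n_train (the interpolation threshold), while the test error typically peaks near that threshold and then decreases again as p grows, which is the double-descent signature the abstract refers to.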