Models like LASSO and ridge regression are extensively used in practice due to their interpretability, ease of use, and strong theoretical guarantees. Cross-validation (CV) is widely used for hyperparameter tuning in these models, but do practical optimization methods minimize the true out-of-sample loss? A recent line of research shows that the optimum of the CV loss matches the optimum of the out-of-sample loss (possibly after simple corrections). What remains unclear is how tractable it is to minimize the CV loss. In the present paper, we show that, in the case of ridge regression, the CV loss may fail to be quasiconvex and thus may have multiple local optima. We can guarantee that the CV loss is quasiconvex in at least one case: when the spectrum of the covariate matrix is nearly flat and the noise in the observed responses is not too high. More generally, we show that quasiconvexity status is independent of many properties of the observed data (response norm, covariate-matrix right singular vectors, and singular-value scaling) and has a complex dependence on the few properties that remain. We empirically confirm our theory using simulated experiments.
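To make the object of study concrete, the following sketch evaluates the leave-one-out CV loss of ridge regression over a grid of regularization strengths, using the standard hat-matrix shortcut (the LOOCV residual is the full-data residual rescaled by one minus the leverage). The function name and the simulated data are illustrative, not from the paper; the abstract's point is precisely that this curve need not be quasiconvex in general, so a grid or multi-start search is safer than assuming a single local minimum.

```python
import numpy as np

def ridge_loocv_loss(X, y, lam):
    """Leave-one-out CV mean squared error for ridge regression.

    Uses the standard shortcut: with hat matrix
    H = X (X'X + lam I)^{-1} X', the LOOCV residual is
    e_i = (y_i - yhat_i) / (1 - H_ii), so no refitting is needed.
    """
    n, d = X.shape
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    resid = (y - H @ y) / (1.0 - np.diag(H))
    return float(np.mean(resid ** 2))

# Illustrative simulated data (assumed setup, not the paper's experiments).
rng = np.random.default_rng(0)
n, d = 50, 10
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d)
y = X @ beta + 0.5 * rng.standard_normal(n)

# Sweep lambda on a log grid and record the CV loss curve.
lams = np.logspace(-3, 3, 25)
losses = np.array([ridge_loocv_loss(X, y, lam) for lam in lams])
```

Because quasiconvexity can fail, one would inspect `losses` for multiple local minima rather than trusting a single descent run from an arbitrary starting lambda.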