The linear regression model cannot be fitted to high-dimensional data, as the high dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmenting the loss function with a penalty, i.e. a function of the regression coefficients. The ridge penalty is the sum of the squared regression coefficients, giving rise to ridge regression. Here many aspects of ridge regression are reviewed, e.g. its moments, mean squared error, equivalence to constrained estimation, and relation to Bayesian regression. Its behaviour and use are illustrated in simulation and on omics data. Subsequently, ridge regression is generalized to allow for a more general penalty. The ridge penalization framework is then translated to logistic regression, and its properties are shown to carry over. To contrast ridge penalized estimation, the final chapter introduces its lasso counterpart.
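As a minimal sketch of the central point, the snippet below (NumPy, with simulated data and an arbitrary penalty value lam = 1.0, both chosen here purely for illustration) shows that the ridge estimator remains well-defined even when the number of covariates exceeds the sample size, precisely the regime where ordinary least squares is non-identifiable:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 50  # high-dimensional: more covariates (p) than samples (n)
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = 1.0  # a sparse "true" coefficient vector, assumed for the simulation
y = X @ beta + 0.1 * rng.standard_normal(n)

# X'X is a p x p matrix of rank at most n < p, hence singular:
# the OLS normal equations have no unique solution.
rank = np.linalg.matrix_rank(X.T @ X)

# Adding the ridge penalty lam * I makes the system invertible,
# yielding the unique ridge estimate (X'X + lam I)^{-1} X'y.
lam = 1.0  # illustrative penalty parameter
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

The penalty thus restores identifiability by regularizing the singular normal equations, at the price of shrinking the estimates toward zero.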