We discuss the prediction accuracy of assumed statistical models in terms of prediction errors for the generalized linear model and penalized maximum likelihood methods. Using the generalized approximate message passing (GAMP) algorithm and the replica method, we derive the forms of estimators for the prediction errors: the Cp criterion, information criteria, and the leave-one-out cross-validation (LOOCV) error. These estimators coincide when the number of model parameters is sufficiently small; however, they differ, particularly in the overparametrized region, where the number of model parameters is larger than the data dimension. In this paper, we review the prediction errors and the corresponding estimators and discuss their differences. Within the framework of GAMP, we show that the information criteria can be understood as a fluctuation-response relationship. Further, we demonstrate how to approach the LOOCV error from the information criteria by utilizing the expression provided by GAMP.