We discuss the prediction accuracy of assumed statistical models, measured by prediction errors, for the generalized linear model and penalized maximum likelihood methods. Using the generalized approximate message passing (GAMP) algorithm and the replica method, we derive the forms of estimators of the prediction errors: the $C_p$ criterion, information criteria, and the leave-one-out cross-validation (LOOCV) error. These estimators coincide when the number of model parameters is sufficiently small; however, they differ, in particular in the overparametrized region, where the number of model parameters exceeds the data dimension. In this paper, we review the prediction errors and the corresponding estimators, and discuss their differences. Within the GAMP framework, we show that the information criteria can be expressed in terms of the variance of the estimates. Further, we demonstrate how the LOOCV error can be approached from the information criteria by utilizing the expression provided by GAMP.
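As a small illustration of the two kinds of estimator compared here, the sketch below computes a $C_p$-type estimate and the LOOCV error for ridge regression, a simple penalized maximum likelihood method under Gaussian noise. It is not the GAMP or replica derivation: it uses the standard closed-form hat matrix of ridge regression instead. The function names, the penalty `lam`, and the assumption that the noise variance `sigma2` is known are all choices made for this example only.

```python
import numpy as np


def ridge_hat_matrix(X, lam):
    """Hat matrix H with y_hat = H @ y for ridge regression (no intercept)."""
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)


def cp_estimate(X, y, lam, sigma2):
    """C_p-type estimate: training error + 2 * sigma2 * tr(H) / n.

    tr(H) plays the role of the effective number of parameters;
    sigma2 is assumed known here.
    """
    n = len(y)
    H = ridge_hat_matrix(X, lam)
    resid = y - H @ y
    return np.mean(resid ** 2) + 2.0 * sigma2 * np.trace(H) / n


def loocv_shortcut(X, y, lam):
    """LOOCV error from a single fit, using e_i / (1 - H_ii)."""
    H = ridge_hat_matrix(X, lam)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)


def loocv_brute_force(X, y, lam):
    """LOOCV error by refitting n times, each time leaving one sample out."""
    n, p = X.shape
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        beta = np.linalg.solve(Xi.T @ Xi + lam * np.eye(p), Xi.T @ yi)
        errs[i] = (y[i] - X[i] @ beta) ** 2
    return errs.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, lam, sigma2 = 50, 0.1, 0.25
    # p = 5: fewer parameters than samples; p = 100: overparametrized.
    for p in (5, 100):
        X = rng.standard_normal((n, p)) / np.sqrt(n)
        beta_true = rng.standard_normal(p)
        y = X @ beta_true + np.sqrt(sigma2) * rng.standard_normal(n)
        print(
            f"p={p:3d}  C_p={cp_estimate(X, y, lam, sigma2):.4f}  "
            f"LOOCV={loocv_shortcut(X, y, lam):.4f}  "
            f"LOOCV (refit)={loocv_brute_force(X, y, lam):.4f}"
        )
```

For ridge regression the single-fit LOOCV formula is exact, so the last two printed columns should agree up to rounding. The $C_p$-type and LOOCV columns target different prediction errors (in-sample and out-of-sample, respectively), which is one reason they need not agree.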