We discuss the prediction accuracy of assumed statistical models in terms of prediction errors for the generalized linear model and penalized maximum likelihood methods. Using the generalized approximate message passing (GAMP) algorithm and the replica method, we derive the forms of estimators for the prediction errors, such as the $C_p$ criterion, information criteria, and the leave-one-out cross-validation (LOOCV) error. These estimators coincide with each other when the number of model parameters is sufficiently small; however, they differ from one another, particularly in the parameter region where the number of model parameters is larger than the data dimension. In this paper, we review the prediction errors and the corresponding estimators and discuss their differences. In the framework of GAMP, we show that the information criteria can be expressed using the variance of the estimates. Further, we demonstrate how to approach the LOOCV error from the information criteria by utilizing the expression provided by GAMP.
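The agreement between estimators in the small-parameter regime can be checked numerically in a simple special case. The following is a minimal sketch, not the GAMP-based derivation of this paper: it assumes ridge regression (a Gaussian linear model with an $\ell_2$ penalty and no intercept), a known noise variance, and a $C_p$-type estimate defined as the training error plus $2\sigma^2\,\mathrm{tr}(H)/n$, where $H$ is the hat matrix. The function names, the synthetic data, and the parameter values are illustrative assumptions. For this linear smoother the LOOCV error has an exact closed form in terms of the diagonal of $H$, which the sketch compares against explicit refitting.

```python
import numpy as np


def ridge_hat_matrix(X, lam):
    """Hat matrix H such that y_hat = H @ y for ridge regression (no intercept)."""
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)


def cp_estimate(X, y, lam, sigma2):
    """C_p-type estimate: training error + 2 * sigma^2 * tr(H) / n (assumed form)."""
    n = len(y)
    H = ridge_hat_matrix(X, lam)
    resid = y - H @ y
    return np.mean(resid ** 2) + 2.0 * sigma2 * np.trace(H) / n


def loocv_shortcut(X, y, lam):
    """LOOCV error from a single fit, using residual_i / (1 - H_ii)."""
    H = ridge_hat_matrix(X, lam)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)


def loocv_brute_force(X, y, lam):
    """LOOCV error by refitting the ridge estimate n times, once per held-out sample."""
    n, p = X.shape
    errs = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        Xi, yi = X[mask], y[mask]
        beta = np.linalg.solve(Xi.T @ Xi + lam * np.eye(p), Xi.T @ yi)
        errs[i] = (y[i] - X[i] @ beta) ** 2
    return errs.mean()


def demo(n=200, p=5, lam=1.0, sigma=0.5, seed=0):
    """Synthetic example with far fewer parameters than samples (p << n)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = rng.standard_normal(p)
    y = X @ beta + sigma * rng.standard_normal(n)
    return (
        cp_estimate(X, y, lam, sigma ** 2),
        loocv_shortcut(X, y, lam),
        loocv_brute_force(X, y, lam),
    )


if __name__ == "__main__":
    cp, fast, brute = demo()
    print(f"C_p-type estimate: {cp:.4f}")
    print(f"LOOCV (shortcut):  {fast:.4f}")
    print(f"LOOCV (refitting): {brute:.4f}")
```

With `p` much smaller than `n`, the $C_p$-type estimate and the LOOCV error should be close. Increasing `p` toward or beyond `n` in `demo` is one way to observe the two estimates drifting apart, although this toy setting does not reproduce the GAMP or replica analysis.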