It is shown how to efficiently and accurately compute and optimize a range of cross validation criteria for a wide range of models estimated by minimizing a quadratically penalized smooth loss. Example models include generalized additive models for location scale and shape and smooth additive quantile regression. Example losses include negative log likelihoods and smooth quantile losses. Example cross validation criteria include leave-out-neighbourhood cross validation for dealing with un-modelled short range autocorrelation as well as the more familiar leave-one-out cross validation. For a $p$ coefficient model of $n$ data, estimable at $O(np^2)$ computational cost, the general $O(n^2p^2)$ cost of ordinary cross validation is reduced to $O(np^2)$, computing the cross validation criterion to $O(p^3n^{-2})$ accuracy. This is achieved by directly approximating the model coefficient estimates under data subset omission, via efficiently computed single step Newton updates of the full data coefficient estimates. Optimization of the resulting cross validation criterion, with respect to multiple smoothing/precision parameters, can be achieved efficiently using quasi-Newton optimization, adapted to deal with the indefiniteness that occurs when the optimal value for a smoothing parameter tends to infinity. The link between cross validation and the jackknife can be exploited to achieve reasonably well calibrated uncertainty quantification for the model coefficients in non standard settings such as leaving-out-neighbourhoods under residual autocorrelation or quantile regression. Several practical examples are provided, focussing particularly on dealing with un-modelled auto-correlation.
翻译:暂无翻译