Prediction, where observed data is used to quantify uncertainty about a future observation, is a fundamental problem in statistics. Prediction sets with coverage probability guarantees are a common solution, but these do not provide probabilistic uncertainty quantification in the sense of assigning beliefs to relevant assertions about the future observable. Alternatively, we recommend the use of a probabilistic predictor, a data-dependent (imprecise) probability distribution for the to-be-predicted observation given the observed data. It is essential that the probabilistic predictor be reliable or valid, and here we offer a notion of validity and explore its behavioral and statistical implications. In particular, we show that valid probabilistic predictors avoid sure loss and lead to prediction procedures with desirable frequentist error rate control properties. We also provide a general inferential model construction that yields a provably valid probabilistic predictor, and we illustrate this construction in regression and classification applications.
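To make the frequentist error rate control described in the abstract concrete, the following is a minimal sketch under stated assumptions. It is not the inferential model construction of the paper. It uses a familiar rank-based (conformal-style) plausibility contour for a real-valued next observation, assumes the data are exchangeable, and checks by simulation that the contour falls at or below a level alpha with frequency no more than roughly alpha. The function names `plausibility` and `miss_rate`, the median-distance score, and the Gaussian simulation are illustrative choices, not taken from the source.

```python
import random
from statistics import median


def plausibility(data, candidate):
    """Rank-based plausibility of `candidate` as the next observation.

    Illustrative assumption: the score is the distance from the median
    of the augmented sample, which treats all n + 1 values symmetrically.
    """
    augmented = list(data) + [candidate]
    center = median(augmented)
    scores = [abs(z - center) for z in augmented]
    # Fraction of augmented scores at least as large as the candidate's.
    return sum(s >= scores[-1] for s in scores) / len(scores)


def miss_rate(n=19, alpha=0.1, reps=2000, seed=1):
    """Estimate how often the true next value gets plausibility <= alpha."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(reps):
        # Draw n observed values plus one future value, all i.i.d.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
        observed, future = sample[:-1], sample[-1]
        if plausibility(observed, future) <= alpha:
            misses += 1
    return misses / reps


if __name__ == "__main__":
    # Under exchangeability this should be close to, and not much above, alpha.
    print(miss_rate())
```

Under these assumptions, the set of candidate values whose plausibility exceeds alpha is a prediction set with coverage of at least 1 - alpha, which is the kind of error rate guarantee the abstract refers to.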