A learning procedure takes as input a dataset and performs inference for the parameters $\theta$ of a model that is assumed to have given rise to the dataset. Here we consider learning procedures whose output is a probability distribution, representing uncertainty about $\theta$ after seeing the dataset. Bayesian inference is a prime example of such a procedure, but one can also construct other learning procedures that return distributional output. This paper studies conditions for a learning procedure to be considered calibrated, in the sense that the true data-generating parameters are plausible as samples from its distributional output. A learning procedure that is calibrated need not be statistically efficient, and vice versa. A hypothesis-testing framework is developed in order to assess, using simulation, whether a learning procedure is calibrated. Finally, we exploit our framework to test the calibration of some learning procedures that are motivated as approximations to Bayesian inference and are nevertheless widely used.
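The abstract does not spell out the simulation-based test itself; the following is a minimal sketch of one plausible check, assuming a toy conjugate normal model, a learning procedure that returns posterior draws, and a chi-square test of rank uniformity. All names, the model, and the choice of test are illustrative, not the paper's actual procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def learning_procedure(y, n_draws=200):
    """Toy learning procedure: exact posterior draws for a normal mean
    with known unit variance and a N(0, 1) prior (hypothetical model)."""
    n = len(y)
    post_var = 1.0 / (1.0 + n)
    post_mean = post_var * y.sum()
    return rng.normal(post_mean, np.sqrt(post_var), size=n_draws)

def calibration_ranks(n_reps=500, n_obs=10, n_draws=200):
    """For each replicate: draw theta from the prior, simulate a dataset,
    run the learning procedure, and record the rank of the true theta
    among the returned draws."""
    ranks = np.empty(n_reps, dtype=int)
    for r in range(n_reps):
        theta_true = rng.normal(0.0, 1.0)            # prior draw
        y = rng.normal(theta_true, 1.0, size=n_obs)  # simulated dataset
        draws = learning_procedure(y, n_draws)
        ranks[r] = np.sum(draws < theta_true)        # rank in 0..n_draws
    return ranks

# If the procedure is calibrated, the ranks should be (approximately)
# uniform on {0, ..., n_draws}; a chi-square goodness-of-fit test gives
# one simple hypothesis test of that uniformity.
ranks = calibration_ranks()
counts, _ = np.histogram(ranks, bins=10, range=(0, 200))
print(stats.chisquare(counts))
```

A large chi-square p-value in this sketch is consistent with calibration, while a small one flags that the true parameters are not plausible as draws from the procedure's distributional output.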