Conditional density models f(y|x), where x represents a potentially high-dimensional feature vector, are an integral part of uncertainty quantification in prediction and Bayesian inference. However, such models can be difficult to calibrate. While existing validation techniques can determine whether an approximated conditional density is compatible overall with a data sample, they lack practical procedures for identifying, localizing, and interpreting the nature of (statistically significant) discrepancies over the entire feature space. In this paper, we present more discerning diagnostics such as (i) the "Local Coverage Test" (LCT), which is able to distinguish an arbitrarily misspecified model from the true conditional density of the sample, and (ii) "Amortized Local P-P plots" (ALP), which can quickly provide interpretable graphical summaries of distributional differences at any location x in the feature space. Our validation procedures scale to high dimensions, and can potentially adapt to any type of data at hand. We demonstrate the effectiveness of LCT and ALP through a simulated experiment and a realistic application to parameter inference for galaxy images.
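As a rough illustration of why a global check can miss local miscalibration, the following minimal sketch compares a global P-P curve with a local one built from probability integral transform (PIT) values. Everything in it is an assumption made for illustration: the simulated data (x ~ N(0, 1), y | x ~ N(x, 1)), the misspecified model that ignores x, and the naive windowing around a point x0. It is not the LCT or ALP procedure itself, which the abstract does not specify in detail.

```python
import random
from statistics import NormalDist


def simulate(n, seed=0):
    """Draw (x, y) pairs with x ~ N(0, 1) and y | x ~ N(x, 1) (illustrative data)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rng.gauss(x, 1.0) for x in xs]
    return xs, ys


# Hypothetical misspecified model: it ignores x and predicts the marginal
# distribution of y, which is N(0, sqrt(2)) under the simulation above.
MODEL = NormalDist(mu=0.0, sigma=2 ** 0.5)


def pit(y):
    """PIT value of y: the model CDF evaluated at the observed y."""
    return MODEL.cdf(y)


def pp_curve(pit_values, alphas):
    """Empirical fraction of PIT values <= alpha, for each alpha."""
    n = len(pit_values)
    return [sum(p <= a for p in pit_values) / n for a in alphas]


xs, ys = simulate(20000)
alphas = [0.1, 0.25, 0.5, 0.75, 0.9]

# Global check: pool all PIT values, regardless of x.
global_pp = pp_curve([pit(y) for y in ys], alphas)

# Naive local check (an assumption of this sketch): keep only the points
# whose x falls in a small window around x0.
x0, width = 1.5, 0.1
local_values = [pit(y) for x, y in zip(xs, ys) if abs(x - x0) <= width]
local_pp = pp_curve(local_values, alphas)

for a, g, l in zip(alphas, global_pp, local_pp):
    print(f"alpha={a:.2f}  global={g:.3f}  local(x0={x0})={l:.3f}")
```

In this setup the global curve stays close to the diagonal because the model matches the marginal distribution of y, while the local curve near x0 = 1.5 falls well below it. A windowed estimate like this becomes impractical as the dimension of x grows, which is the kind of setting the abstract's procedures are meant to address.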