We study three notions of uncertainty quantification -- calibration, confidence intervals and prediction sets -- for binary classification in the distribution-free setting, that is, without making any distributional assumptions on the data. With a focus on calibration, we establish a 'tripod' of theorems that connect these three notions for score-based classifiers. A direct implication is that distribution-free calibration is possible, even asymptotically, only with a scoring function whose level sets partition the feature space into at most countably many sets. Parametric calibration schemes such as variants of Platt scaling do not satisfy this requirement, while nonparametric schemes based on binning do. To close the loop, we derive distribution-free confidence intervals for binned probabilities under both fixed-width and uniform-mass binning. By our 'tripod' theorems, these confidence intervals for binned probabilities yield distribution-free calibration. We also derive extensions to settings with streaming data and covariate shift.
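To make the binning schemes mentioned above concrete, here is a minimal sketch of uniform-mass (equal-frequency) binning calibration: bin edges are placed at empirical quantiles of the scores so each bin holds roughly the same number of points, and the calibrated probability for a bin is the average label within it. All function names are illustrative, not from the paper, and the sketch omits the confidence intervals the paper derives for these binned probabilities.

```python
# Illustrative sketch of uniform-mass binning calibration (names are hypothetical).
from bisect import bisect_right

def uniform_mass_bins(scores, n_bins):
    """Interior bin edges at empirical quantiles, so bins have near-equal mass."""
    s = sorted(scores)
    n = len(s)
    return [s[(i * n) // n_bins] for i in range(1, n_bins)]

def calibrate(scores, labels, edges):
    """Empirical probability (average label) per bin on a calibration set."""
    n_bins = len(edges) + 1
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    for s_i, y_i in zip(scores, labels):
        b = bisect_right(edges, s_i)  # index of the bin containing s_i
        counts[b] += 1
        sums[b] += y_i
    # Fall back to 0.5 for an empty bin (a simplistic, illustrative choice).
    return [sums[b] / counts[b] if counts[b] else 0.5 for b in range(n_bins)]

def predict(score, edges, bin_probs):
    """Map a new score to the calibrated probability of its bin."""
    return bin_probs[bisect_right(edges, score)]
```

Because the bins partition the score range into finitely many level sets, this scheme meets the countable-partition condition that the abstract identifies as necessary for distribution-free calibration.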