Trustworthy deployment of ML models requires a proper measure of uncertainty, especially in safety-critical applications. We focus on uncertainty quantification (UQ) for classification problems via two avenues -- prediction sets using conformal prediction and calibration of probabilistic predictors by post-hoc binning -- since these possess distribution-free guarantees for i.i.d. data. Two common ways of generalizing beyond the i.i.d. setting include handling covariate and label shift. Within the context of distribution-free UQ, the former has already received attention, but not the latter. It is known that label shift hurts prediction, and we first argue that it also hurts UQ, by showing degradation in coverage and calibration. Piggybacking on recent progress in addressing label shift (for better prediction), we examine the right way to achieve UQ by reweighting the aforementioned conformal and calibration procedures whenever some unlabeled data from the target distribution is available. We examine these techniques theoretically in a distribution-free framework and demonstrate their excellent practical performance.
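To make the reweighting concrete, here is a minimal Python sketch (an illustration under stated assumptions, not the exact procedure from the paper) of a label-shift-reweighted split-conformal prediction set. It assumes class-wise importance weights w(k) = pi_target(k) / pi_source(k) have already been estimated, e.g., from unlabeled target data; the function name and its inputs are hypothetical.

import numpy as np

def label_shift_weighted_conformal_set(cal_scores, cal_labels, test_scores, w, alpha=0.1):
    """
    cal_scores : (n,) nonconformity scores s(X_i, Y_i) on labeled source calibration data
    cal_labels : (n,) integer class labels of the calibration points
    test_scores: (K,) nonconformity scores s(x_test, k) for each candidate class k
    w          : (K,) estimated importance weights w(k) = pi_target(k) / pi_source(k)
    alpha      : target miscoverage level
    """
    K = len(test_scores)
    pred_set = []
    for k in range(K):
        # Calibration weights depend on the observed labels; the test point,
        # hypothesized to have label k, receives weight w[k].
        cal_w = w[cal_labels]
        norm = cal_w.sum() + w[k]
        p_cal = cal_w / norm
        # Weighted (1 - alpha)-quantile of calibration scores, with the test
        # point's probability mass placed at +infinity (conservative convention).
        order = np.argsort(cal_scores)
        cum = np.cumsum(p_cal[order])
        idx = np.searchsorted(cum, 1 - alpha)
        q = np.inf if idx >= len(cal_scores) else cal_scores[order[idx]]
        if test_scores[k] <= q:
            pred_set.append(k)
    return pred_set

When the weights are exact, the standard weighted-exchangeability argument yields marginal coverage of at least 1 - alpha; with weights estimated from unlabeled target data, the guarantee holds only approximately, which is the regime the distribution-free analysis addresses.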