An important challenge facing modern machine learning is how to rigorously quantify the uncertainty of model predictions. Conveying uncertainty is especially important when there are changes to the underlying data distribution that might invalidate the predictive model. Yet, most existing uncertainty quantification algorithms break down in the presence of such shifts. We propose a novel approach that addresses this challenge by constructing \emph{probably approximately correct (PAC)} prediction sets in the presence of covariate shift. Our approach focuses on the setting where there is a covariate shift from the source distribution (where we have labeled training examples) to the target distribution (for which we want to quantify uncertainty). Our algorithm assumes access to importance weights that encode how the probabilities of the training examples change under the covariate shift. In practice, importance weights typically need to be estimated; thus, we extend our algorithm to the setting where we are given confidence intervals for the importance weights rather than their true values. We demonstrate the effectiveness of our approach on various covariate shifts constructed from the DomainNet and ImageNet datasets.
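To make the PAC criterion concrete, the following display sketches one standard formalization (the notation is ours and serves only as an illustrative summary, not a verbatim statement from the paper): given $n$ labeled calibration examples $Z \sim P^n$ drawn from the source distribution $P$ and user-specified error levels $\epsilon, \delta \in (0,1)$, the goal is a set-valued predictor $C$ whose coverage on the target distribution $Q$ satisfies
\[
\mathbb{P}_{Z \sim P^n}\!\left[\, \mathbb{P}_{(x,y) \sim Q}\bigl[\, y \in C(x) \,\bigr] \;\ge\; 1 - \epsilon \,\right] \;\ge\; 1 - \delta ,
\]
where the covariate-shift assumption means that $P$ and $Q$ share the same conditional distribution of $y$ given $x$ and differ only in the marginal over $x$, so importance weights $w(x) = q(x)/p(x)$ suffice to relate coverage on source calibration data to coverage on the target distribution.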