Black-box machine learning methods are now routinely used in high-risk settings, such as medical diagnostics, which demand uncertainty quantification to avoid consequential model failures. Distribution-free uncertainty quantification (distribution-free UQ) is a user-friendly paradigm for creating statistically rigorous confidence intervals/sets for such predictions. Critically, the intervals/sets are valid without distributional or model assumptions, with explicit guarantees even for finitely many datapoints. Moreover, they adapt to the difficulty of the input: when the input example is difficult, the uncertainty intervals/sets are large, signaling that the model might be wrong. Without much work and without retraining, one can use distribution-free methods on any underlying algorithm, such as a neural network, to produce confidence sets guaranteed to contain the ground truth with a user-specified probability, such as 90%. Indeed, the methods are easy to understand and general, applying to many modern prediction problems arising in computer vision, natural language processing, deep reinforcement learning, and so on. This hands-on introduction is aimed at a reader interested in the practical implementation of distribution-free UQ who is not necessarily a statistician. We lead the reader through the practical theory and applications of distribution-free UQ, beginning with conformal prediction and culminating with distribution-free control of any risk, such as the false-discovery rate or the false positive rate of out-of-distribution detection. We include many explanatory illustrations, examples, and code samples in Python, with PyTorch syntax. The goal is to provide the reader with a working understanding of distribution-free UQ, allowing them to put confidence intervals on their algorithms, in one self-contained document.
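To make the promise above concrete, here is a minimal sketch of split conformal prediction for classification. It uses synthetic softmax scores in place of a real model, and all variable names (`probs`, `qhat`, `prediction_set`) are illustrative, not from the text: we calibrate a score threshold on held-out data so that prediction sets contain the true label with probability at least 1 - alpha (e.g., 90%).

```python
import numpy as np

rng = np.random.default_rng(0)
n_cal, n_classes = 1000, 10
alpha = 0.1  # target miscoverage: sets should cover the truth ~90% of the time

# Simulated softmax outputs for a calibration set (stand-in for a real model).
logits = rng.normal(size=(n_cal, n_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, n_classes, size=n_cal)

# Conformal score: one minus the softmax probability of the true class.
scores = 1.0 - probs[np.arange(n_cal), labels]

# Finite-sample-corrected quantile of the calibration scores.
q_level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
qhat = np.quantile(scores, q_level, method="higher")

# Prediction set for a new input: all classes whose score falls below qhat.
test_probs = probs[0]  # reuse one row as a stand-in test example
prediction_set = np.where(1.0 - test_probs <= qhat)[0]
print(prediction_set)
```

On easy inputs (one dominant softmax probability) the set is small; on hard inputs it grows, which is exactly the input-adaptive behavior described above.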