Supervised learning aims to train a classifier under the assumption that training and test data come from the same distribution. To relax this assumption, researchers have studied a more realistic setting: out-of-distribution (OOD) detection, where test data may come from classes that are unknown during training (i.e., OOD data). Because OOD data are unavailable at training time and highly diverse, good generalization ability is crucial for effective OOD detection algorithms. To study the generalization of OOD detection, in this paper we investigate the probably approximately correct (PAC) learning theory of OOD detection, which researchers have posed as an open problem. First, we identify a necessary condition for the learnability of OOD detection. Then, using this condition, we prove several impossibility theorems for the learnability of OOD detection in certain scenarios. Although these impossibility theorems are discouraging, we find that some of their conditions may not hold in practical scenarios. Based on this observation, we then give several necessary and sufficient conditions that characterize the learnability of OOD detection in several practical scenarios. Lastly, we also offer theoretical support for several representative OOD detection methods based on our theory.
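As background for readers unfamiliar with the task, a common score-based formulation of OOD detection rejects an input as OOD when a confidence score falls below a threshold. The following is a minimal sketch using the maximum softmax probability as the score; the threshold value and function names are illustrative assumptions, not part of the theory studied in this paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_ood_detector(logits, threshold=0.5):
    """Flag an input as OOD when its maximum softmax probability
    falls below `threshold`; otherwise predict the in-distribution
    class. `threshold` is a hypothetical value for illustration."""
    probs = softmax(np.asarray(logits, dtype=float))
    scores = probs.max(axis=-1)        # confidence score per input
    is_ood = scores < threshold        # True -> reject as OOD
    preds = probs.argmax(axis=-1)      # class prediction if accepted
    return is_ood, preds, scores

# One confident input vs. one near-uniform (low-confidence) input.
is_ood, preds, scores = msp_ood_detector([[5.0, 0.0, 0.0],
                                          [0.1, 0.0, 0.05]])
```

The learnability question the paper studies is, informally, whether any such detector can be trained to low risk on both in-distribution classification and OOD rejection using only in-distribution training data.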