Detecting out-of-distribution (OOD) samples is vital for developing machine-learning-based models in safety-critical systems. Common approaches to OOD detection assume access to some OOD samples during training, which may not be available in real-world scenarios. Instead, we utilize the {\em predictive normalized maximum likelihood} (pNML) learner, in which no assumptions are made on the tested input. We derive an explicit expression for the pNML and its generalization error, denoted the {\em regret}, for a single-layer neural network (NN). We show that this learner generalizes well when (i) the test vector resides in a subspace spanned by the eigenvectors associated with the large eigenvalues of the empirical correlation matrix of the training data, or (ii) the test sample is far from the decision boundary. Furthermore, we describe how to efficiently apply the derived pNML regret to any pretrained deep NN by employing the explicit pNML of the last layer, followed by the softmax function. Applying the derived regret to a deep NN requires neither additional tunable parameters nor extra data. We extensively evaluate our approach on 74 OOD detection benchmarks using DenseNet-100, ResNet-34, and WideResNet-40 models trained on CIFAR-100, CIFAR-10, SVHN, and ImageNet-30, showing a significant improvement of up to 15.6\% over recent leading methods.
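To make condition (i) concrete, the following is a minimal NumPy sketch of the subspace intuition only: it measures how much energy a (normalized) test embedding has outside the span of the eigenvectors associated with the largest eigenvalues of the empirical correlation matrix of the training data. The function name, the cutoff $k$, and the use of last-layer embeddings are illustrative assumptions; the actual score is the derived closed-form pNML regret, which this sketch does not reproduce.

\begin{verbatim}
import numpy as np

def ood_subspace_score(X_train: np.ndarray, x_test: np.ndarray,
                       k: int = 50) -> float:
    """Illustrative OOD score based on condition (i).
    X_train -- (N, d) matrix of training embeddings (e.g., the
               activations feeding the last layer of a pretrained NN).
    x_test  -- (d,) embedding of the test sample.
    k       -- number of dominant eigenvectors kept (hypothetical choice).
    Returns the fraction of the test vector's energy lying outside the
    dominant training subspace: low for ID-like inputs, high for OOD-like.
    """
    corr = X_train.T @ X_train / X_train.shape[0]  # empirical correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)        # eigenvalues in ascending order
    top = eigvecs[:, -k:]                          # eigenvectors of the k largest eigenvalues
    x = x_test / (np.linalg.norm(x_test) + 1e-12)  # scale-invariant comparison
    residual = x - top @ (top.T @ x)               # component outside the dominant subspace
    return float(residual @ residual)              # in [0, 1]; larger => more OOD-like
\end{verbatim}

In the deep-NN setting described above, such a per-sample score would be computed from the last-layer inputs and thresholded to flag OOD samples; condition (ii) would additionally favor test points whose softmax output places them far from the decision boundary.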