To ensure robust and reliable classification results, out-of-distribution (OoD) indicators based on deep generative models have recently been proposed and shown to work well on small datasets. In this paper, we conduct the first large collection of benchmarks (containing 92 dataset pairs, one order of magnitude larger than previous collections) for existing OoD indicators and observe that none perform well. We therefore advocate that a large collection of benchmarks is mandatory for evaluating OoD indicators. We propose DOI, a novel theoretical framework for divergence-based (rather than traditional likelihood-based) OoD indicators in deep generative models. Following this framework, we further propose a simple and effective OoD detection algorithm: Single-shot Fine-tune. It significantly outperforms prior work by 5~8 points in AUROC, and its performance is close to optimal. The likelihood criterion has recently been shown to be ineffective at detecting OoD samples. Single-shot Fine-tune instead introduces a novel fine-tune criterion for OoD detection: whether the likelihood of the test sample improves after fine-tuning a well-trained model on it. The fine-tune criterion is clear and easy to follow, and we believe it will lead the OoD detection field into a new stage.
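The fine-tune criterion can be illustrated with a minimal sketch. This is not the paper's actual implementation (which uses deep generative models); it uses a hypothetical toy 1-D Gaussian as the "generative model" purely to show the idea: fine-tune the model for one step on the test sample, and score the sample by how much its log-likelihood improves. In-distribution samples, which the model already fits well, gain little; far-away (OoD) samples gain a lot.

```python
import math

class GaussianModel:
    """Toy stand-in for a deep generative model: density N(mu, sigma^2)."""

    def __init__(self, mu=0.0, sigma=1.0):
        self.mu, self.sigma = mu, sigma

    def log_likelihood(self, x):
        return (-0.5 * math.log(2 * math.pi * self.sigma ** 2)
                - (x - self.mu) ** 2 / (2 * self.sigma ** 2))

    def finetune_step(self, x, lr=0.1):
        # One gradient-ascent step on log-likelihood w.r.t. mu:
        # d/dmu log N(x; mu, sigma^2) = (x - mu) / sigma^2
        self.mu += lr * (x - self.mu) / self.sigma ** 2

def finetune_score(mu, sigma, x, lr=0.1):
    """Single-shot fine-tune criterion (sketch): likelihood gain after
    one fine-tuning step on the test sample itself. A larger gain
    suggests the sample is OoD with respect to the training data."""
    model = GaussianModel(mu, sigma)
    before = model.log_likelihood(x)
    model.finetune_step(x, lr)
    after = model.log_likelihood(x)
    return after - before

# A sample near the training mode (mu=0) barely improves;
# a distant sample improves substantially.
in_score = finetune_score(0.0, 1.0, 0.1)   # in-distribution
ood_score = finetune_score(0.0, 1.0, 5.0)  # out-of-distribution
```

In practice the score would be thresholded (or fed into AUROC computation across a test set); the single gradient step is what makes the criterion "single-shot".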