An intuitive way to detect out-of-distribution (OOD) data is via the density function of a fitted probabilistic generative model: points with low density may be classed as OOD. But this approach has been found to fail in deep learning settings. In this paper, we list some falsehoods that machine learning researchers believe about density-based OOD detection. Many recent works have proposed likelihood-ratio-based methods to "fix" the problem. We propose a framework, the OOD proxy framework, to unify these methods, and we argue that the likelihood ratio is a principled method for OOD detection and not a mere "fix". Finally, we discuss the relationship between domain discrimination and semantics.
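To make the two scoring rules concrete, here is a minimal sketch in Python on a one-dimensional toy problem. It is not the paper's method or its OOD proxy framework: the Gaussian model, the 1% quantile threshold, the broad reference distribution, and all function names are assumptions chosen for illustration. A toy like this also does not reproduce the failures reported in deep learning settings; it only shows what a density threshold and a likelihood-ratio score compute.

```python
import numpy as np


def fit_gaussian(x):
    """Fit a 1-D Gaussian by maximum likelihood; returns (mean, std)."""
    return float(np.mean(x)), float(np.std(x))


def log_density(x, mean, std):
    """Log-density of a 1-D Gaussian N(mean, std^2) evaluated at x."""
    x = np.asarray(x, dtype=float)
    return -0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2


# Toy in-distribution training data (assumption: standard normal).
rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=5000)
mean, std = fit_gaussian(train)

# Density rule: flag a point as OOD if its log-density falls below the
# 1% quantile of the training log-densities (the 1% level is arbitrary).
threshold = np.quantile(log_density(train, mean, std), 0.01)


def is_ood_by_density(x):
    """True where the fitted model assigns a log-density below the threshold."""
    return log_density(x, mean, std) < threshold


# Likelihood-ratio rule: compare the fitted model against a reference
# distribution. The broad Gaussian below is a hypothetical stand-in.
REF_MEAN, REF_STD = 0.0, 10.0


def log_likelihood_ratio(x):
    """log p_model(x) - log p_reference(x); low values suggest OOD."""
    return log_density(x, mean, std) - log_density(x, REF_MEAN, REF_STD)
```

Calling `is_ood_by_density(8.0)` flags a point far from the training data, while `log_likelihood_ratio` gives a score whose sign says which of the two distributions explains the point better. How the reference distribution is chosen is exactly what differs between likelihood-ratio methods, so the broad Gaussian here should be read as a placeholder.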