Detecting out-of-distribution (OOD) examples is critical in many applications. We propose an unsupervised method to detect OOD samples using a $k$-NN density estimate with respect to a classification model's intermediate activations on in-distribution samples. We leverage a recent insight about label smoothing, which we call the \emph{Label Smoothed Embedding Hypothesis}, and show that one of the implications is that the $k$-NN density estimator performs better as an OOD detection method both theoretically and empirically when the model is trained with label smoothing. Finally, we show that our proposal outperforms many OOD baselines and also provide new finite-sample high-probability statistical results for $k$-NN density estimation's ability to detect OOD examples.
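The scoring idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single layer of activations has already been extracted, uses synthetic Gaussian vectors as stand-ins for those activations, and uses a hypothetical helper `knn_ood_scores` with illustrative values for `k` and the threshold quantile. It relies on the standard $k$-NN density estimate, which is inversely related to the distance $r_k(x)$ from $x$ to its $k$-th nearest in-distribution neighbor, so a large $r_k(x)$ indicates low density and a likely OOD sample.

```python
import numpy as np


def knn_ood_scores(train_emb, query_emb, k=5):
    """Return the distance from each query to its k-th nearest training embedding.

    A larger distance corresponds to a lower k-NN density estimate,
    i.e. a more OOD-looking query. (Hypothetical helper for illustration.)
    """
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = (
        np.sum(query_emb ** 2, axis=1)[:, None]
        - 2.0 * query_emb @ train_emb.T
        + np.sum(train_emb ** 2, axis=1)[None, :]
    )
    d2 = np.maximum(d2, 0.0)  # guard against tiny negatives from round-off
    # k-th smallest squared distance in each row.
    kth = np.partition(d2, k - 1, axis=1)[:, k - 1]
    return np.sqrt(kth)


# Synthetic stand-ins for intermediate activations (assumption, not real data).
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(500, 16))    # in-distribution reference set
in_dist = rng.normal(0.0, 1.0, size=(100, 16))  # held-out in-distribution queries
ood = rng.normal(6.0, 1.0, size=(100, 16))      # shifted, OOD-like queries

in_scores = knn_ood_scores(train, in_dist)
ood_scores = knn_ood_scores(train, ood)

# Flag a query as OOD when its score exceeds a quantile of in-distribution scores.
threshold = np.quantile(in_scores, 0.95)
flagged = ood_scores > threshold
```

In practice the embeddings would come from the trained classifier's intermediate layers, and both `k` and the threshold quantile would need to be tuned on held-out in-distribution data.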