Despite the impressive performance of deep networks in vision, language, and healthcare, their unpredictable behavior on samples drawn from a distribution different from the training distribution causes severe problems in deployment. To improve the reliability of neural-network-based classifiers, we define a new task, natural attribute-based shift (NAS) detection, which aims to detect samples shifted from the training distribution by a natural attribute such as the age of subjects or the brightness of images. Using the natural attributes present in existing datasets, we introduce benchmark datasets for NAS detection in the vision, language, and medical domains. Further, we conduct an extensive evaluation of representative prior out-of-distribution (OOD) detection methods on the NAS datasets and observe that their performance is inconsistent. To understand this, we analyze the relationship between the location of NAS samples in the feature space and the performance of distance-based and confidence-based OOD detection methods. Based on this analysis, we split NAS samples into three categories and suggest a simple modification to the training objective that yields an improved OOD detection method capable of detecting samples from all NAS categories.
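As a rough illustration of why distance-based and confidence-based OOD scores can disagree depending on where a sample lies in the feature space, the sketch below computes both scores for a toy nearest-mean classifier. Everything in it is an assumption made for illustration: the two-dimensional features, the class means, the helper names (`logits_from`, `confidence_score`, `distance_score`), and the choice of Euclidean distance and maximum softmax probability as stand-ins for the two families of methods. It is not the benchmark, the categorization, or the modified training objective described in the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_from(feature, class_means):
    """Hypothetical classifier head: logit = negative squared distance to each class mean."""
    return [-math.dist(feature, mu) ** 2 for mu in class_means]


def confidence_score(logits):
    """Confidence-based OOD score: 1 - max softmax probability (higher = more suspicious)."""
    return 1.0 - max(softmax(logits))


def distance_score(feature, class_means):
    """Distance-based OOD score: Euclidean distance to the nearest class mean."""
    return min(math.dist(feature, mu) for mu in class_means)


# Toy setup (assumed): two classes with means on the x-axis.
class_means = [[0.0, 0.0], [4.0, 0.0]]

# A shifted sample far from both classes but clearly on one side of the boundary.
far_sample = [-10.0, 0.0]
# A shifted sample lying between the two classes, on the decision boundary.
boundary_sample = [2.0, 0.0]

far_conf = confidence_score(logits_from(far_sample, class_means))
far_dist = distance_score(far_sample, class_means)
boundary_conf = confidence_score(logits_from(boundary_sample, class_means))
boundary_dist = distance_score(boundary_sample, class_means)

# The far sample looks confident (low confidence score) yet is far away (high distance score);
# the boundary sample is uncertain (high confidence score) yet relatively close to a class mean.
print("far:     ", far_conf, far_dist)
print("boundary:", boundary_conf, boundary_dist)
```

In this toy setting the distance score flags the far sample while the confidence score flags the boundary sample, which is one plausible way the location of shifted samples could produce inconsistent results across the two families of detectors.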