We study the out-of-distribution (OOD) prediction behavior of neural networks when they classify images from unseen classes or corrupted images. To probe the OOD behavior, we introduce a new measure, nearest category generalization (NCG), where we compute the fraction of OOD inputs that are classified with the same label as their nearest neighbor in the training set. Our motivation stems from understanding the prediction patterns of adversarially robust networks, since previous work has identified unexpected consequences of training to be robust to norm-bounded perturbations. We find that robust networks have consistently higher NCG accuracy than naturally trained networks, even when the OOD data is much farther away than the robustness radius. This implies that the local regularization of robust training has a significant impact on the network's decision regions. We replicate our findings across many datasets, comparing new and existing training methods. Overall, adversarially robust networks resemble a nearest neighbor classifier when it comes to OOD data. Code is available at https://github.com/yangarbiter/nearest-category-generalization.
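For concreteness, the NCG measure described above can be computed roughly as follows. This is a minimal sketch, not the repository's implementation: it assumes the data are NumPy arrays, uses Euclidean distance in flattened input space as the nearest-neighbor metric, and relies on a hypothetical `predict_fn` that returns the network's predicted labels.

```python
import numpy as np

def ncg_accuracy(predict_fn, train_X, train_y, ood_X, batch_size=256):
    """Fraction of OOD inputs whose predicted label matches the label of
    their nearest training example (nearest category generalization)."""
    train_flat = train_X.reshape(len(train_X), -1).astype(np.float64)
    ood_flat = ood_X.reshape(len(ood_X), -1).astype(np.float64)
    train_sq = (train_flat ** 2).sum(axis=1)  # ||x_train||^2, precomputed once

    nn_labels = np.empty(len(ood_flat), dtype=train_y.dtype)
    for start in range(0, len(ood_flat), batch_size):
        batch = ood_flat[start:start + batch_size]
        # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
        d2 = (batch ** 2).sum(axis=1)[:, None] + train_sq[None, :] \
             - 2.0 * batch @ train_flat.T
        nn_labels[start:start + batch_size] = train_y[d2.argmin(axis=1)]

    preds = predict_fn(ood_X)  # network's label predictions on the OOD inputs
    return float(np.mean(preds == nn_labels))
```

A robust network would score higher on this measure than a naturally trained one if its predictions on OOD inputs agree more often with the labels of the nearest training examples.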