Benchmark performance alone is not a reliable predictor of how a deep learning classifier will perform once deployed. In particular, if an image classifier has picked up spurious features in the training data, its predictions can fail in unexpected ways. In this paper, we develop a framework that allows us to systematically identify spurious features in large datasets like ImageNet. It is based on our neural PCA components and their visualization. Previous work on spurious features of image classifiers often operates in toy settings or requires costly pixel-wise annotations. In contrast, we validate our results by checking that the presence of the harmful spurious feature of a class is sufficient to trigger the prediction of that class. We introduce a novel dataset, "Spurious ImageNet", and check how much existing classifiers rely on spurious features.