Over the last decade, the development of deep image classification networks has mostly been driven by the search for the best classification accuracy on standardized benchmarks like ImageNet. More recently, this focus has been expanded by the notion of model robustness, i.e., the ability of a model to generalize to previously unseen changes in the data distribution. While new benchmarks such as ImageNet-C have been introduced to measure robustness properties, we argue that fixed test sets capture only a small portion of possible data variations and are thus limited and prone to producing new, overfitted solutions. To overcome these drawbacks, we propose estimating the robustness of a model directly from the structure of its learned feature space. We introduce robustness indicators that are obtained via unsupervised clustering of latent representations inside a trained classifier and that show very high correlations with model performance on corrupted test data.
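To make the idea of a clustering-based indicator concrete, the following is a minimal sketch, not the method of the paper. The abstract does not name the clustering algorithm or define the indicators, so everything here is an assumption chosen for illustration: plain k-means as the clustering step, cluster purity as the indicator, and two synthetic 2D blobs (`toy_latents`, a hypothetical helper) standing in for latent representations extracted from a trained classifier.

```python
import random
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct positions in the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments


def cluster_purity(assignments, labels):
    """Fraction of points whose label is the majority label of their cluster."""
    by_cluster = {}
    for a, y in zip(assignments, labels):
        by_cluster.setdefault(a, Counter())[y] += 1
    majority_total = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority_total / len(labels)


def toy_latents(n_per_class=50, spread=0.3, seed=0):
    """Hypothetical stand-in for latent features: two Gaussian blobs in 2D."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0)]
    points, labels = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(label)
    return points, labels


if __name__ == "__main__":
    points, labels = toy_latents()
    assignments = kmeans(points, k=2)
    print("purity:", cluster_purity(assignments, labels))
```

In this sketch the class labels are used only to score the clusters, not to form them, so the clustering step itself stays unsupervised. On well-separated blobs like these the purity should be close to 1; whether purity (or any other cluster statistic) actually tracks accuracy on corrupted data is the empirical question the abstract addresses and is not demonstrated by this toy example.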