Synthetic corruptions gathered into benchmarks are frequently used to measure the robustness of neural networks to distribution shifts. However, robustness measured on synthetic corruption benchmarks is not always predictive of robustness to the distribution shifts encountered in real-world applications. In this paper, we propose a methodology for building synthetic corruption benchmarks whose robustness estimates correlate better with robustness to real-world distribution shifts. Using the overlapping criterion, we split synthetic corruptions into categories that help to better understand neural network robustness. Based on these categories, we identify three parameters that should be taken into account when constructing a corruption benchmark: the number of represented categories, the balance among categories, and the size of the benchmark. Applying the proposed methodology, we build a new benchmark, ImageNet-Syn2Nat, to predict the robustness of image classifiers.