Robustness is a fundamental pillar of Machine Learning (ML) classifiers, substantially determining their reliability. Methods for assessing classifier robustness are therefore essential. In this work, we address the challenge of evaluating corruption robustness in a way that allows comparability and interpretability on a given dataset. We propose a test data augmentation method that uses a robustness distance $\epsilon$ derived from the dataset's minimal class separation distance. The resulting MSCR (mean statistical corruption robustness) metric allows a dataset-specific comparison of different classifiers with respect to their corruption robustness. The MSCR value is interpretable, as it represents the classifier's avoidable loss of accuracy due to statistical corruptions. On 2D and image data, we show that the metric reflects different levels of classifier robustness. Furthermore, by training and testing classifiers with different levels of noise, we observe unexpected optima in classifiers' robust accuracy. While researchers have frequently reported a significant tradeoff in accuracy when training robust models, we strengthen the view that a tradeoff between accuracy and corruption robustness is not inherent. Our results indicate that robustness training through simple data augmentation can even slightly improve accuracy.
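The idea of the metric can be illustrated with a minimal sketch on toy 2D data. This is not the paper's implementation: the choice of Gaussian corruptions, the factor of one half applied to the minimal class separation distance, and all variable names are assumptions made here for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import pairwise_distances

# Toy 2D classification data and a simple classifier (illustrative choices).
X, y = make_blobs(n_samples=400, centers=2, cluster_std=1.0, random_state=0)
clf = LogisticRegression().fit(X, y)

# Robustness distance epsilon derived from the minimal distance between
# points of different classes (halving it is an assumption of this sketch).
eps = pairwise_distances(X[y == 0], X[y == 1]).min() / 2

# Test data augmentation: replicate each point and add statistical
# corruptions of scale eps (Gaussian noise assumed here).
rng = np.random.default_rng(0)
n_rep = 20
X_corr = np.repeat(X, n_rep, axis=0) + rng.normal(scale=eps, size=(len(X) * n_rep, 2))
y_corr = np.repeat(y, n_rep)

# An MSCR-style value: the accuracy lost when moving from clean to
# corrupted test data, i.e. the avoidable loss due to corruptions.
acc_clean = clf.score(X, y)
acc_corr = clf.score(X_corr, y_corr)
mscr_like = acc_clean - acc_corr
print(f"clean acc: {acc_clean:.3f}, corrupted acc: {acc_corr:.3f}, gap: {mscr_like:.4f}")
```

A perfectly corruption-robust classifier would show a gap near zero; larger gaps indicate accuracy that could, in principle, be recovered by robustness training such as noise augmentation.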