Robustness in machine learning is commonly studied in the adversarial setting, yet real-world noise (such as measurement noise) is random rather than adversarial. Model behavior under such noise is captured by average-case robustness, i.e., the probability of obtaining consistent predictions in a local region around an input. However, the naïve approach to computing average-case robustness based on Monte-Carlo sampling is statistically inefficient, especially for high-dimensional data, leading to prohibitive computational costs for large-scale applications. In this work, we develop the first analytical estimators to efficiently compute average-case robustness of multi-class discriminative models. These estimators linearize models in the local region around an input and analytically compute the robustness of the resulting linear models. We show empirically that these estimators efficiently compute the robustness of standard deep learning models and demonstrate these estimators' usefulness for various tasks involving robustness, such as measuring robustness bias and identifying dataset samples that are vulnerable to noise perturbation. In doing so, this work not only proposes a new framework for robustness, but also makes its computation practical, enabling the use of average-case robustness in downstream applications.
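To make the contrast concrete, below is a minimal sketch, not the paper's actual estimators, of the two quantities the abstract refers to: a Monte-Carlo estimate of average-case robustness under isotropic Gaussian noise, and an analytical computation for a locally linearized binary classifier, for which consistency probability has a closed form via the Gaussian CDF. The function names, the noise scale `sigma`, and the restriction to the binary case are illustrative assumptions; the paper itself targets multi-class models.

```python
import numpy as np
from scipy.stats import norm

def mc_robustness(predict, x, sigma, n_samples=10_000, rng=None):
    """Monte-Carlo estimate: fraction of Gaussian-perturbed inputs whose
    predicted class matches the prediction at x."""
    rng = np.random.default_rng(rng)
    base = predict(x[None])[0]
    noise = rng.normal(0.0, sigma, size=(n_samples, x.size))
    preds = predict(x[None] + noise)
    return np.mean(preds == base)

def linear_gaussian_robustness(w, b, x, sigma):
    """Analytical robustness for a linearized binary classifier
    f(x) = w.x + b under noise N(0, sigma^2 I): since w.eps is Gaussian
    with std sigma*||w||, the sign of f is preserved with probability
    Phi(|f(x)| / (sigma * ||w||))."""
    margin = abs(w @ x + b)
    return norm.cdf(margin / (sigma * np.linalg.norm(w)))

# Toy usage: for a model that is exactly linear, the two estimates agree closely.
w, b = np.array([1.0, -2.0]), 0.5
predict = lambda X: (X @ w + b > 0).astype(int)
x, sigma = np.array([0.3, 0.1]), 0.25
print(mc_robustness(predict, x, sigma), linear_gaussian_robustness(w, b, x, sigma))
```

The sampling estimate needs many forward passes to resolve probabilities near 0 or 1 in high dimensions, whereas the linearized computation requires only the local gradient and a single CDF evaluation, which is the efficiency gap the abstract describes.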