Model robustness refers to a model's ability to generalize well under unforeseen distribution shifts, including data corruptions and adversarial attacks. Data augmentation is one of the most prevalent and effective ways to enhance robustness. Despite the great success of diverse augmentations across fields, a unified theoretical understanding of why they improve model robustness is still lacking. Through the lens of flat minima and generalization bounds, we theoretically derive a general condition under which label-preserving augmentations confer robustness to diverse distribution shifts, and we show that this condition is strongly correlated with robustness against different distribution shifts in practice. Unlike most earlier works, our theoretical framework accommodates all label-preserving augmentations and is not restricted to particular distribution shifts. We substantiate our theory through simulations on existing common-corruption and adversarial-robustness benchmarks built on the CIFAR and ImageNet datasets.
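For intuition, one common formalization of the flat-minima perspective (an illustrative sketch, not necessarily the exact quantity used in this work) measures the worst-case sharpness of the training loss $L$ around the learned weights $w$ within a radius $\rho$:

\[
S_\rho(w) \;=\; \max_{\|\epsilon\|_2 \le \rho} L(w+\epsilon) \;-\; L(w).
\]

Flatness-based generalization bounds of this kind relate the risk under a (possibly shifted) test distribution to the training risk plus a term that grows with $S_\rho(w)$; under this view, augmentations that flatten the loss landscape, i.e., reduce the sharpness term, are expected to improve robustness.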