We introduce a set of image transformations that can be used both as corruptions to evaluate the robustness of models and as data augmentation mechanisms for training neural networks. The primary distinction of the proposed transformations is that, unlike existing approaches such as Common Corruptions, they incorporate the geometry of the scene, leading to corruptions that are more likely to occur in the real world. We also introduce a set of semantic corruptions (e.g. natural object occlusions). We show that these transformations are 'efficient' (can be computed on-the-fly), 'extendable' (can be applied to most image datasets), expose vulnerabilities of existing models, and can effectively make models more robust when employed as '3D data augmentation' mechanisms. Evaluations on several tasks and datasets suggest that incorporating 3D information into benchmarking and training opens up a promising direction for robustness research.
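To make concrete what it can mean for a corruption to incorporate scene geometry, the following is a minimal sketch of a depth-dependent fog effect. It is an illustration only and is not taken from the text above: it uses the standard atmospheric scattering model, in which transmission decays exponentially with distance, and it assumes a per-pixel depth map is available (for example from a sensor or a depth estimator). The function name `depth_aware_fog` and the parameters `beta` and `airlight` are hypothetical.

```python
import numpy as np


def depth_aware_fog(image, depth, beta=0.5, airlight=0.8):
    """Blend an image toward a constant fog color, more strongly at larger depth.

    Hypothetical illustration, not the authors' implementation.
    image:    H x W x 3 float array with values in [0, 1]
    depth:    H x W array of non-negative distances (arbitrary but consistent units)
    beta:     scattering coefficient; larger values give denser fog
    airlight: fog brightness in [0, 1]
    """
    image = np.asarray(image, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)
    if image.shape[:2] != depth.shape:
        raise ValueError("image and depth must share height and width")

    # Transmission t = exp(-beta * d): near pixels keep their color,
    # distant pixels are dominated by the airlight term.
    transmission = np.exp(-beta * depth)[..., None]
    fogged = image * transmission + airlight * (1.0 - transmission)
    return np.clip(fogged, 0.0, 1.0)
```

A 2D corruption would apply the same fog strength to every pixel. Here the strength varies with distance, so nearby objects stay clear while the background fades, which is the kind of geometric consistency the abstract refers to.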