While successful for various computer vision tasks, deep neural networks have been shown to be vulnerable to texture style shifts and small perturbations to which humans are robust. In this work, we show that the robustness of neural networks can be greatly improved through the use of random convolutions as data augmentation. Random convolutions are approximately shape-preserving and may distort local textures. Intuitively, randomized convolutions create an infinite number of new domains with similar global shapes but random local textures. We therefore explore using the outputs of multi-scale random convolutions as new images, or mixing them with the original images, during training. When a network trained with our approach is applied to unseen domains, our method consistently improves performance on domain generalization benchmarks and scales to ImageNet. In particular, in the challenging scenario of generalizing to the sketch domain in PACS and to ImageNet-Sketch, our method outperforms state-of-the-art methods by a large margin. More interestingly, our method can benefit downstream tasks by providing a more robust pretrained visual representation.
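As a concrete illustration, the following is a minimal PyTorch-style sketch of random-convolution augmentation as described above. The function name `rand_conv_augment`, the particular kernel-size set, and the fan-in weight scaling are our assumptions for the sketch, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def rand_conv_augment(images, kernel_sizes=(1, 3, 5, 7), mix=True):
    """Apply a random convolution to a batch of images of shape (N, C, H, W).

    A fresh random kernel is sampled on every call; the kernel size is
    drawn from `kernel_sizes` to realize the multi-scale variant. When
    `mix` is True, the output is blended with the original images using
    a random coefficient, as in the mixing variant described above.
    (Hypothetical sketch; names and scaling are assumptions.)
    """
    n, c, h, w = images.shape
    # Sample a kernel size uniformly at random (all sizes are odd,
    # so padding k // 2 preserves the spatial resolution).
    k = kernel_sizes[torch.randint(len(kernel_sizes), (1,)).item()]
    # Random filter weights; fan-in scaling keeps output magnitudes
    # comparable to the input.
    weight = torch.randn(c, c, k, k, device=images.device) / (k * c ** 0.5)
    out = F.conv2d(images, weight, padding=k // 2)
    if mix:
        # Random convex combination of the original and convolved images.
        alpha = torch.rand(1, device=images.device)
        out = alpha * images + (1 - alpha) * out
    return out
```

In a training loop, such a function would be applied to each batch before the forward pass, so that every iteration sees a differently textured version of the same globally shaped content.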