We propose a new regularization method to alleviate over-fitting in deep neural networks. The key idea is to use randomly transformed training samples to regularize a set of sub-networks, which are generated by sampling different widths of the original network, during training. The proposed method thus introduces self-guided disturbances to the raw gradients of the network and is therefore termed Gradient Augmentation (GradAug). We demonstrate that GradAug helps the network learn well-generalized and more diverse representations. Moreover, it is easy to implement and can be applied to various network structures and applications. GradAug improves ResNet-50 to 78.79% accuracy on ImageNet classification, a new state of the art. Combined with CutMix, it further boosts performance to 79.67%, outperforming an ensemble of advanced training tricks. Generalization ability is evaluated on COCO object detection and instance segmentation, where GradAug significantly surpasses other state-of-the-art methods. GradAug is also robust to image distortions and FGSM adversarial attacks, and is highly effective in low-data regimes. Code is available at https://github.com/taoyang1122/GradAug
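To make the key idea concrete, below is a minimal sketch of a GradAug-style training step, following only the description in the abstract rather than the authors' exact implementation. It assumes a PyTorch model exposing a hypothetical `set_width()` switch for sampling sub-networks by width, a user-supplied `random_transform()` (e.g., random scaling), and standard cross-entropy supervision for both the full network and the sub-networks.

```python
# Hedged sketch: not the official GradAug code, just the gradient-accumulation
# pattern implied by the abstract (full network on raw samples, width-sampled
# sub-networks on randomly transformed samples).
import torch
import torch.nn.functional as F

def gradaug_step(model, images, labels, optimizer, num_subnets=3,
                 width_range=(0.8, 1.0), random_transform=None):
    optimizer.zero_grad()

    # Train the full-width network on the original samples.
    model.set_width(1.0)                       # hypothetical width switch
    loss = F.cross_entropy(model(images), labels)
    loss.backward()

    # Sub-networks with randomly sampled widths are trained on randomly
    # transformed versions of the same samples; their gradients accumulate
    # on top of the raw gradients as self-guided disturbances.
    for _ in range(num_subnets):
        width = torch.empty(1).uniform_(*width_range).item()
        model.set_width(width)                 # hypothetical width switch
        inputs = random_transform(images) if random_transform else images
        sub_loss = F.cross_entropy(model(inputs), labels)
        sub_loss.backward()                    # gradients accumulate

    optimizer.step()
    return loss.item()
```

In practice, the sub-network supervision and width-sampling scheme follow the released code at the repository above; the sketch only illustrates how transformed samples and width-sampled sub-networks jointly shape the gradient of one update.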