Data augmentation has emerged as a powerful technique for improving the performance of deep neural networks and has led to state-of-the-art results in computer vision. However, state-of-the-art data augmentation strongly distorts training images, leading to a disparity between the examples seen during training and those seen at inference. In this work, we explore a recently proposed training paradigm to correct for this disparity: using an auxiliary BatchNorm for the potentially out-of-distribution, strongly augmented images. Our experiments then focus on how to define the BatchNorm parameters used at evaluation. To eliminate the train-test disparity, we experiment with using batch statistics defined by clean training images only, yet surprisingly find that this does not improve model performance. Instead, we investigate using BatchNorm parameters defined by weak augmentations and find that this method significantly improves performance on common image classification benchmarks such as CIFAR-10, CIFAR-100, and ImageNet. We then explore a fundamental trade-off between accuracy and robustness that arises from using different BatchNorm parameters, providing greater insight into the benefits of data augmentation for model performance.
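The auxiliary-BatchNorm scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it keeps two independent sets of running statistics, one for clean or weakly augmented batches and one for strongly augmented batches, with shared affine parameters, and lets the caller pick which statistics to normalize with at evaluation time. All class and method names (`AuxiliaryBatchNorm`, `forward_train`, `forward_eval`, the `"main"`/`"aux"` branch labels) are illustrative assumptions.

```python
import numpy as np

class AuxiliaryBatchNorm:
    """Sketch of an auxiliary BatchNorm: separate running statistics for
    clean/weakly augmented batches ('main') and strongly augmented
    batches ('aux'); affine parameters (gamma, beta) are shared."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.momentum, self.eps = momentum, eps
        # Two independent sets of running statistics.
        self.stats = {
            "main": {"mean": np.zeros(num_features), "var": np.ones(num_features)},
            "aux":  {"mean": np.zeros(num_features), "var": np.ones(num_features)},
        }
        # Shared affine parameters.
        self.gamma = np.ones(num_features)
        self.beta = np.zeros(num_features)

    def forward_train(self, x, branch):
        """Normalize a training batch with its own batch statistics and
        update the running statistics of the chosen branch only."""
        mean, var = x.mean(axis=0), x.var(axis=0)
        s = self.stats[branch]
        s["mean"] = (1 - self.momentum) * s["mean"] + self.momentum * mean
        s["var"] = (1 - self.momentum) * s["var"] + self.momentum * var
        return self.gamma * (x - mean) / np.sqrt(var + self.eps) + self.beta

    def forward_eval(self, x, branch="main"):
        """Normalize with the running statistics of the chosen branch.
        Choosing 'main' vs 'aux' here is the evaluation-time decision
        the experiments above compare."""
        s = self.stats[branch]
        return self.gamma * (x - s["mean"]) / np.sqrt(s["var"] + self.eps) + self.beta
```

During training, clean or weakly augmented batches would be routed through the `"main"` branch and strongly augmented batches through the `"aux"` branch; at test time a single branch's statistics normalize all inputs, which is where the accuracy-robustness trade-off discussed above shows up.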