Recent studies have revealed that convolutional neural networks do not generalize well to small image transformations, e.g. rotations by a few degrees or translations of a few pixels. To improve robustness to such transformations, we propose to introduce data augmentation at intermediate layers of the neural architecture, in addition to the common data augmentation applied to the input images. By introducing small perturbations to the activation maps (features) at various levels, we develop the capacity of the neural network to cope with such transformations. We conduct experiments on three image classification benchmarks (Tiny ImageNet, Caltech-256 and Food-101), considering two different convolutional architectures (ResNet-18 and DenseNet-121). Compared with two state-of-the-art stabilization methods, our approach consistently attains the best trade-off between accuracy and mean flip rate.
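To make the idea concrete, below is a minimal PyTorch sketch of feature-level augmentation as described above: a module that applies a small random shift and additive noise to intermediate activation maps during training. The specific perturbation types, magnitudes, and insertion point are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18


class FeatureAugment(nn.Module):
    """Applies small random perturbations to intermediate activation maps.

    A minimal sketch of feature-level augmentation; the perturbation
    choices (circular shift + Gaussian noise) and their magnitudes are
    assumptions for illustration only.
    """

    def __init__(self, max_shift: int = 2, noise_std: float = 0.05):
        super().__init__()
        self.max_shift = max_shift  # max translation in feature-map cells
        self.noise_std = noise_std  # std of additive Gaussian noise

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:  # perturb features only during training
            return x
        # Small random translation of the activation map (circular shift),
        # mimicking a translation of a few pixels at the feature level.
        dh = int(torch.randint(-self.max_shift, self.max_shift + 1, (1,)))
        dw = int(torch.randint(-self.max_shift, self.max_shift + 1, (1,)))
        x = torch.roll(x, shifts=(dh, dw), dims=(2, 3))
        # Small additive noise on the features.
        return x + self.noise_std * torch.randn_like(x)


# Hypothetical usage: wrap an intermediate stage of ResNet-18 so its
# output features are perturbed during training (insertion point assumed).
model = resnet18(num_classes=200)  # e.g. Tiny ImageNet has 200 classes
model.layer2 = nn.Sequential(model.layer2, FeatureAugment())
```

Because the perturbation is applied only in training mode, inference is unchanged; the network simply learns to produce predictions that are stable under small feature-level displacements and noise.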