Recent findings have shown that highly over-parameterized Neural Networks generalize without pretraining or explicit regularization, even though they reach zero training error, i.e., completely over-fit by memorizing the training data. This is surprising, since it runs counter to traditional machine learning wisdom. In our empirical study we corroborate these findings in the domain of fine-grained image classification. We show that very large Convolutional Neural Networks with millions of weights do learn with only a handful of training samples and without image augmentation, explicit regularization, or pretraining. We train the architectures ResNet018, ResNet101, and VGG19 on subsets of the difficult benchmark datasets Caltech101, CUB_200_2011, FGVCAircraft, Flowers102, and StanfordCars, each with 100 classes or more, perform a comprehensive comparative study, and draw implications for the practical application of CNNs. Finally, we show that a randomly initialized VGG19 with 140 million weights learns to distinguish airplanes from motorbikes with up to 95% accuracy using only 20 training samples per class.
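To make the described setup concrete, the sketch below illustrates the kind of training regime the study refers to: a randomly initialized VGG19 trained on a small per-class subset of a fine-grained benchmark, with no image augmentation, no explicit regularization, and no pretraining. This is a minimal illustration, not the authors' exact protocol; the dataset choice (Flowers102), subset size, optimizer, and hyperparameters are assumptions made for the example.

```python
# Minimal sketch of the training regime described above (assumptions noted in the lead-in).
import torch
from torch import nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms, models
from collections import defaultdict

# Deterministic resize only -- no image augmentation.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

train_set = datasets.Flowers102(root="data", split="train", download=True,
                                transform=preprocess)

# Keep at most 20 samples per class (the "handful of training samples" regime).
per_class, keep = defaultdict(int), []
for idx, (_, label) in enumerate(train_set):
    if per_class[label] < 20:
        per_class[label] += 1
        keep.append(idx)
loader = DataLoader(Subset(train_set, keep), batch_size=32, shuffle=True)

# Randomly initialized VGG19 (weights=None => no pretraining), ~140M parameters.
model = models.vgg19(weights=None, num_classes=102)
# Plain SGD without weight decay, i.e., no explicit regularization.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(100):  # train until (near) zero training error
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```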