We claim that Convolutional Neural Networks (CNNs) learn to classify images using only small, seemingly unrecognizable tiles. We show experimentally that CNNs trained only on such tiles can match or even surpass the performance of CNNs trained on full images. Conversely, CNNs trained on full images make similar predictions when shown only small tiles. We also propose the first a priori theoretical model for convolutional data sets that seems to explain this behavior. This gives additional support to the long-standing suspicion that CNNs do not need to understand the global structure of images to achieve state-of-the-art accuracies. Surprisingly, it also suggests that over-fitting is not needed either.
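To make the experimental idea concrete, the following is a minimal sketch of the "train on small tiles only" setup, assuming PyTorch/torchvision, CIFAR-10, and an 8x8 tile size; the tile size, architecture, and hyper-parameters here are illustrative choices, not the authors' exact configuration.

```python
# Sketch of training a CNN on small random tiles instead of full images.
# Assumptions: PyTorch/torchvision, CIFAR-10, 8x8 tiles (illustrative only).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

TILE = 8  # side length of the small tile shown to the network

# Each training image is replaced by a single random TILE x TILE crop.
tile_transform = transforms.Compose([
    transforms.RandomCrop(TILE),
    transforms.ToTensor(),
])

train_set = datasets.CIFAR10("data", train=True, download=True,
                             transform=tile_transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

# A small CNN whose input is a single tile rather than the full image.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 10),
)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for images, labels in train_loader:  # one pass over the tile-cropped data
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```

At evaluation time one could, for instance, aggregate tile-level predictions over several crops of a test image; the exact evaluation protocol used in the paper may differ.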