A central question of machine learning is how deep nets manage to learn tasks in high dimensions. An appealing hypothesis is that they achieve this feat by building a representation of the data in which information irrelevant to the task is lost. For image datasets, this view is supported by the observation that, after (and not before) training, the neural representation becomes less and less sensitive to diffeomorphisms acting on images as the signal propagates through the net. This loss of sensitivity correlates with performance and, surprisingly, correlates with a gain of sensitivity to white noise acquired during training. These facts are unexplained, and, as we demonstrate, they still hold when white noise is added to the images of the training set. Here, we (i) show empirically, for various architectures, that stability to image diffeomorphisms is achieved by both spatial and channel pooling, (ii) introduce a model scale-detection task which reproduces our empirical observations on spatial pooling, and (iii) compute analytically how the sensitivity to diffeomorphisms and noise scales with depth due to spatial pooling. The scalings are found to depend on the presence of strides in the net architecture. We find that the increased sensitivity to noise is due to the perturbing noise piling up during pooling, after being rectified by ReLU units.
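To make the quantities above concrete, here is a minimal sketch (in PyTorch; not the authors' code) of how one might estimate a representation's sensitivity to a small smooth deformation relative to white noise of matched norm. The helper names (`smooth_displacement`, `deform`, `relative_sensitivity`) and the low-frequency construction of the deformation are illustrative assumptions, not the paper's exact definitions.

```python
# Minimal sketch (not the authors' code): estimate how strongly a
# representation f responds to a small diffeomorphism of the image
# versus white noise of the same norm. Helper names and the
# low-frequency deformation field are illustrative assumptions.
import torch
import torch.nn.functional as F

def smooth_displacement(h, w, cutoff=4, amplitude=0.02):
    # Random field containing only low spatial frequencies: a crude
    # stand-in for a small, smooth deformation of the image plane.
    coeffs = torch.randn(1, 2, cutoff, cutoff)
    field = F.interpolate(coeffs, size=(h, w), mode="bicubic",
                          align_corners=False)[0]
    return amplitude * field / field.norm() * (h * w) ** 0.5

def deform(x, field):
    # Warp an image batch x of shape (N, C, H, W) along the field.
    n, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack((xs + field[0], ys + field[1]), dim=-1)
    return F.grid_sample(x, grid.expand(n, -1, -1, -1),
                         align_corners=False)

def relative_sensitivity(f, x):
    # Ratio D/G: response to the diffeomorphism over response to
    # Gaussian noise whose norm matches the deformation's effect.
    field = smooth_displacement(x.shape[-2], x.shape[-1])
    x_diffeo = deform(x, field)
    eta = torch.randn_like(x)
    eta = eta / eta.norm() * (x_diffeo - x).norm()
    with torch.no_grad():
        fx = f(x)
        d = (f(x_diffeo) - fx).pow(2).sum()
        g = (f(x + eta) - fx).pow(2).sum()
    return (d / g).item()

# Example usage on random images; the f of interest would be a
# trained net (or one of its hidden layers), here a bare pooling
# layer stands in.
x = torch.randn(8, 3, 32, 32)
print(relative_sensitivity(torch.nn.AvgPool2d(4), x))
```

Tracking this ratio layer by layer, before and after training, is one way to probe the claims above: pooling should drive the response to diffeomorphisms down relative to the response to noise as depth increases.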