Initialization of parameters in deep neural networks has been shown to have a significant impact on network performance (Mishkin & Matas, 2015). The initialization scheme devised by He et al. constrains the mean of convolution activations, allowing deep networks to be trained effectively (He et al., 2015a). Orthogonal initializations, and more generally orthogonal matrices in standard recurrent networks, have been shown to eradicate the vanishing and exploding gradient problems (Pascanu et al., 2012). The majority of current initialization schemes, however, do not fully account for the intrinsic structure of the convolution operator. Using the duality between the Fourier transform and the convolution operator, Convolution Aware Initialization builds orthogonal filters in Fourier space and maps them back to the standard space via the inverse Fourier transform. With Convolution Aware Initialization we observe not only higher accuracy and lower loss, but also faster convergence. We achieve a new state of the art on the CIFAR10 dataset, and results close to the state of the art on various other tasks.
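The construction described above can be sketched in a few lines of NumPy. The sketch below is illustrative rather than the authors' implementation: the function name `conv_aware_init`, the use of a QR decomposition to obtain orthonormal rows, and the final He-style variance rescaling are assumptions filled in around the abstract's description (orthogonal filters built in Fourier space, then an inverse Fourier transform back to the standard space).

```python
import numpy as np

def conv_aware_init(out_channels, in_channels, k):
    """Sketch: build filters that are orthogonal in Fourier space,
    then represent them in the spatial domain via the inverse transform."""
    # Length of a flattened rfft2 spectrum for a k x k kernel.
    fft_len = k * (k // 2 + 1)
    # This simple sketch draws at most fft_len orthogonal vectors per filter.
    assert in_channels <= fft_len, "sketch assumes in_channels <= fft_len"
    W = np.empty((out_channels, in_channels, k, k))
    for i in range(out_channels):
        # QR of a random square matrix yields an orthogonal matrix,
        # so its rows form an orthonormal set in Fourier space.
        q, _ = np.linalg.qr(np.random.randn(fft_len, fft_len))
        for j in range(in_channels):
            spectrum = q[j].reshape(k, k // 2 + 1)
            # Inverse real FFT represents the filter in the standard space.
            W[i, j] = np.fft.irfft2(spectrum, s=(k, k))
    # Rescale to the variance suggested by He et al. (2015a) for ReLU nets;
    # this scaling choice is an assumption, not taken from the abstract.
    W *= np.sqrt(2.0 / (in_channels * k * k)) / W.std()
    return W

filters = conv_aware_init(out_channels=64, in_channels=3, k=3)
print(filters.shape)  # (64, 3, 3, 3)
```

Because convolution in the spatial domain corresponds to pointwise multiplication in the Fourier domain, enforcing orthogonality on the spectra is what gives the resulting spatial filters their decorrelated responses; the variance rescaling at the end is only needed so the initialization matches the activation statistics expected by schemes such as He et al. (2015a).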