Batch Whitening is a technique that accelerates and stabilizes training by transforming input features to have zero mean (Centering) and unit variance (Scaling), and by removing linear correlation between channels (Decorrelation). In commonly used architectures, which are empirically optimized for Batch Normalization, the normalization layer appears between the convolution and the activation function. Subsequent Batch Whitening studies have employed the same structure without further analysis, even though Batch Whitening has been analyzed on the premise that the input of a linear layer is whitened. To bridge this gap, we propose a new Convolutional Unit that is consistent with the theory, and our method generally improves the performance of Batch Whitening. Moreover, we show the inefficacy of the original Convolutional Unit by investigating the rank and correlation of features. As our method can employ off-the-shelf whitening modules, we use Iterative Normalization (IterNorm), the state-of-the-art whitening module, and obtain significantly improved performance on five image classification datasets: CIFAR-10, CIFAR-100, CUB-200-2011, Stanford Dogs, and ImageNet. Notably, we verify that our method improves the stability and performance of whitening when using large learning rates, group sizes, and iteration numbers.
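As a minimal illustration of the three operations named above (Centering, Scaling, Decorrelation), the sketch below whitens a batch of (N, C) features with an eigendecomposition-based ZCA transform. This is only a conceptual example, not the IterNorm module or the Convolutional Unit proposed in the paper; the function name and parameters are illustrative.

```python
import torch

def zca_whiten(x, eps=1e-5):
    """Whiten a batch of features x with shape (N, C):
    center (zero mean), scale (unit variance), and decorrelate channels via ZCA.
    Illustrative sketch only; not the paper's IterNorm implementation."""
    mu = x.mean(dim=0, keepdim=True)           # Centering: subtract per-channel mean
    xc = x - mu
    cov = xc.t() @ xc / x.shape[0]             # (C, C) channel covariance
    eigvals, eigvecs = torch.linalg.eigh(cov)  # eigendecomposition of symmetric covariance
    # ZCA transform U diag(1/sqrt(lambda)) U^T scales each direction to unit
    # variance (Scaling) and removes linear correlation (Decorrelation)
    inv_sqrt = eigvecs @ torch.diag((eigvals + eps).rsqrt()) @ eigvecs.t()
    return xc @ inv_sqrt

# Usage: whitened features have ~zero mean and ~identity covariance
x = torch.randn(512, 16) @ torch.randn(16, 16)   # batch with correlated channels
y = zca_whiten(x)
print(y.mean(dim=0).abs().max())                 # close to 0
print(y.t() @ y / y.shape[0])                    # close to the identity matrix
```

IterNorm replaces the explicit eigendecomposition with Newton's iterations to approximate the inverse square root of the covariance, which is why the iteration number appears as a hyperparameter in the experiments above.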