We propose a novel antialiasing method to increase shift invariance in convolutional neural networks (CNNs). More precisely, we replace the conventional combination "real-valued convolutions + max pooling" ($\mathbb R$Max) with "complex-valued convolutions + modulus" ($\mathbb C$Mod), which produces stable feature representations for band-pass filters with well-defined orientations. In a recent work, we proved that, for such filters, the two operators yield similar outputs; $\mathbb C$Mod can therefore be viewed as a stable alternative to $\mathbb R$Max. In this paper, to separate the band-pass filters from the other freely-trained kernels, we design a "twin" architecture based on the dual-tree complex wavelet packet transform, which generates outputs similar to those of standard CNNs while using fewer trainable parameters. In addition to improving stability to small shifts, our experiments on AlexNet and ResNet show increased prediction accuracy on natural image datasets such as ImageNet and CIFAR10. Furthermore, by preserving high-frequency information, our approach outperforms recent antialiasing methods based on low-pass filtering, while reducing memory usage.
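To make the $\mathbb R$Max vs. $\mathbb C$Mod distinction concrete, the following is a minimal PyTorch sketch, not the authors' implementation, comparing the two operators on a single oriented Gabor (band-pass) filter. The helpers `gabor_kernel`, `rmax`, and `cmod`, and all parameter values, are illustrative assumptions.

```python
import math
import torch
import torch.nn.functional as F

def gabor_kernel(size=9, freq=1.0, theta=0.0, sigma=2.0):
    # Complex Gabor filter: an oriented band-pass filter, returned as
    # separate real and imaginary parts (parameters are illustrative).
    r = torch.arange(size, dtype=torch.float32) - size // 2
    y, x = torch.meshgrid(r, r, indexing="ij")
    u = x * math.cos(theta) + y * math.sin(theta)
    env = torch.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return env * torch.cos(freq * u), env * torch.sin(freq * u)

def rmax(img, w_real, stride=2):
    # RMax: real-valued convolution followed by max pooling.
    y = F.conv2d(img, w_real[None, None])
    return F.max_pool2d(y, kernel_size=stride, stride=stride)

def cmod(img, w_real, w_imag, stride=2):
    # CMod: complex-valued convolution followed by the modulus,
    # subsampled at the same rate as RMax.
    yr = F.conv2d(img, w_real[None, None], stride=stride)
    yi = F.conv2d(img, w_imag[None, None], stride=stride)
    return torch.sqrt(yr ** 2 + yi ** 2)

img = torch.randn(1, 1, 64, 64)
wr, wi = gabor_kernel()
shifted = torch.roll(img, shifts=1, dims=-1)  # 1-pixel horizontal shift

# For an oriented band-pass filter, the CMod output typically varies
# less under a small input shift than the RMax output does.
print("RMax shift error:", (rmax(img, wr) - rmax(shifted, wr)).abs().mean().item())
print("CMod shift error:", (cmod(img, wr, wi) - cmod(shifted, wr, wi)).abs().mean().item())
```

In this toy setting, the modulus discards the local phase of the complex filter response, which is the component most sensitive to subpixel translations; this is the intuition behind treating $\mathbb C$Mod as a stable substitute for $\mathbb R$Max.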