Square convolution is the default unit in convolutional neural networks because it maps well onto the tensor computation of the convolution operation, and it usually has a fixed N x N receptive field (RF). However, what matters most to the network is the effective receptive field (ERF), which indicates how much each pixel contributes to the output. The ERF exhibits a Gaussian distribution and cannot be modeled by simply sampling pixels with offsets. To simulate the ERF, we propose a Gaussian Mask convolutional kernel (GMConv) in this work. Specifically, GMConv uses a Gaussian function to generate a concentrically symmetric mask and places the mask over the kernel to refine the RF. GMConv can directly replace the standard convolutions in existing CNNs and can be trained end-to-end by standard backpropagation. Extensive experiments on multiple image classification benchmark datasets show that our method is comparable to the standard convolution and outperforms it in many cases. For instance, using GMConv for AlexNet and ResNet-50 boosts the top-1 accuracy on ImageNet classification by 0.98% and 0.85%, respectively.
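
The masking idea can be illustrated with a minimal sketch. It is not the GMConv implementation: the abstract does not give the mask's parameterization, so the single width `sigma`, the unnormalized Gaussian, and the helper names `gaussian_mask` and `apply_mask` are all assumptions made for illustration. The sketch builds a mask whose value depends only on the distance from the kernel centre and multiplies it element-wise into the kernel weights.

```python
import math

def gaussian_mask(size, sigma):
    """Build a size x size mask that depends only on distance from the kernel centre.

    Assumption: an unnormalized isotropic Gaussian with one width parameter sigma.
    """
    centre = (size - 1) / 2.0
    mask = []
    for i in range(size):
        row = []
        for j in range(size):
            # Squared Euclidean distance from the centre of the kernel.
            d2 = (i - centre) ** 2 + (j - centre) ** 2
            row.append(math.exp(-d2 / (2.0 * sigma ** 2)))
        mask.append(row)
    return mask

def apply_mask(kernel, mask):
    """Multiply the kernel weights element-wise by the mask."""
    return [[w * m for w, m in zip(k_row, m_row)]
            for k_row, m_row in zip(kernel, mask)]

# Hypothetical 3 x 3 kernel of ones, so the masked kernel equals the mask itself.
kernel = [[1.0] * 3 for _ in range(3)]
mask = gaussian_mask(3, sigma=1.0)
masked = apply_mask(kernel, mask)
```

In this sketch the centre weight is left unchanged while the edge and corner weights are attenuated, with corners attenuated the most. If `sigma` were treated as a trainable parameter, gradients could flow through the mask, which is one way the end-to-end training described above might be realized.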