The translation-equivariant nature of Convolutional Neural Networks (CNNs) is a key reason for their great success in computer vision. However, CNNs do not enjoy more general equivariance properties, such as equivariance to rotation or scaling, which ultimately limits their generalization performance. To address this limitation, we devise a method that endows CNNs with simultaneous equivariance to translation, rotation, and scaling. Our approach defines a convolution-like operation and ensures equivariance via our proposed scalable Fourier-Argand representation. The method retains efficiency similar to that of a traditional network and introduces hardly any additional learnable parameters, since it avoids the computational issues that often arise in group-convolution operators. We validate the efficacy of our approach on image classification, demonstrating its robustness and its generalization to both scaled and rotated inputs.
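As a minimal illustration of the translation equivariance the abstract refers to (not the paper's proposed method), a plain 2D convolution with periodic boundary conditions commutes exactly with circular shifts; the array sizes and shift amounts below are arbitrary choices for the demonstration:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))   # toy "image"
k = rng.standard_normal((3, 3))   # toy filter

def shift(img, dy, dx):
    """Circularly shift an image by (dy, dx) pixels."""
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

# Equivariance: convolving then shifting equals shifting then convolving
# (exact here because boundary="wrap" makes the convolution circular).
a = shift(convolve2d(x, k, mode="same", boundary="wrap"), 2, 3)
b = convolve2d(shift(x, 2, 3), k, mode="same", boundary="wrap")
assert np.allclose(a, b)
```

The analogous identity fails for rotation or scaling of the input, which is precisely the gap the proposed method targets.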