Deep convolutional neural networks (CNNs) have become the mainstream approach in computer vision. Although CNNs have been successful in many computer vision tasks, they are not free from drawbacks: their performance degrades dramatically under geometric transformations such as large rotations. In this paper, we propose a novel CNN architecture that improves robustness against geometric transformations without modifying existing CNN backbones. The key is to enclose the existing backbone with a geometric transformation (and the corresponding reverse transformation) and a feature map ensemble. Because the backbone is left unchanged, the proposed method inherits the strengths of existing CNNs. Furthermore, it can be combined with state-of-the-art data augmentation algorithms to improve their performance. We demonstrate the effectiveness of the proposed method on standard datasets such as CIFAR, CUB-200, and Mnist-rot-12k.
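The idea of enclosing a backbone with a transformation, its reverse, and a feature map ensemble can be illustrated with a minimal sketch. The abstract does not specify the transformation set or the ensemble operator, so everything below is an assumption made for illustration: the transformations are taken to be the four 90-degree rotations, the ensemble is a plain average, and `toy_backbone` and `rotation_ensemble` are hypothetical names, with `toy_backbone` standing in for a real CNN.

```python
import numpy as np

def toy_backbone(x):
    # Hypothetical stand-in for a CNN backbone (assumption, not the paper's model):
    # a horizontal-difference filter followed by ReLU, which is direction-sensitive.
    return np.maximum(x - np.roll(x, 1, axis=1), 0.0)

def rotation_ensemble(backbone, x, num_rotations=4):
    # Wrap an unmodified backbone: rotate the input, extract features,
    # rotate the features back, then average the re-aligned feature maps.
    aligned = []
    for k in range(num_rotations):
        rotated = np.rot90(x, k)                 # geometric transformation
        features = backbone(rotated)             # backbone is used as-is
        aligned.append(np.rot90(features, -k))   # reverse transformation
    return np.mean(aligned, axis=0)              # feature map ensemble (assumed: mean)

rng = np.random.default_rng(0)
image = rng.random((8, 8))  # square input so rotated maps keep the same shape

# How far each model is from commuting with a 90-degree rotation of the input.
plain_gap = np.abs(
    toy_backbone(np.rot90(image)) - np.rot90(toy_backbone(image))
).max()
wrapped_gap = np.abs(
    rotation_ensemble(toy_backbone, np.rot90(image))
    - np.rot90(rotation_ensemble(toy_backbone, image))
).max()
```

Under these assumptions, the wrapped model's feature map rotates together with the input, while the bare toy backbone's does not. This exactness relies on the rotations forming a closed group on a square grid. Arbitrary angles would need interpolation and would hold only approximately.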