We analyze the role of rotational equivariance in convolutional neural networks (CNNs) applied to spherical images. We compare the performance of the group equivariant networks known as S2CNNs and standard non-equivariant CNNs trained with an increasing amount of data augmentation. The chosen architectures can be considered baseline references for the respective design paradigms. Our models are trained and evaluated on single or multiple items from the MNIST or FashionMNIST dataset projected onto the sphere. For the task of image classification, which is inherently rotationally invariant, we find that by considerably increasing the amount of data augmentation and the size of the networks, it is possible for the standard CNNs to reach at least the same performance as the equivariant network. In contrast, for the inherently equivariant task of semantic segmentation, the non-equivariant networks are consistently outperformed by the equivariant networks with significantly fewer parameters. We also analyze and compare the inference latency and training times of the different networks, enabling detailed tradeoff considerations between equivariant architectures and data augmentation for practical problems. The equivariant spherical networks used in the experiments are available at https://github.com/JanEGerken/sem_seg_s2cnn.