Omnidirectional images and spherical representations of 3D shapes cannot be processed with conventional 2D convolutional neural networks (CNNs), because unwrapping the sphere onto a plane introduces large distortions. Using fast implementations of spherical and $SO(3)$ convolutions, researchers have recently developed deep learning methods better suited to classifying spherical images. These convolutional layers naturally extend the notion of convolution to functions on the unit sphere $S^2$ and on the rotation group $SO(3)$, and they are equivariant to 3D rotations. In this paper, we consider the problem of unsupervised learning of rotation-invariant representations for spherical images. In particular, we design an autoencoder architecture consisting of $S^2$ and $SO(3)$ convolutional layers. Because 3D rotations are often a nuisance factor, the latent space is constrained to be exactly invariant to rotations of the input. Because the rotation information is discarded in the latent space, we craft a novel rotation-invariant loss function for training the network. Extensive experiments on multiple datasets demonstrate the usefulness of the learned representations for clustering, retrieval, and classification.
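The step from equivariant layers to an invariant latent code can be illustrated with a toy analogue. The sketch below is not the architecture described above: it replaces $SO(3)$ acting on the sphere with cyclic shifts acting on a sampled circle, and every name in it (`shift`, `circular_correlation`, `invariant_descriptor`) is hypothetical. It only shows, under those assumptions, that a group correlation commutes with the group action and that pooling the resulting feature map over the whole group yields an exactly invariant descriptor.

```python
# Toy analogue only: the cyclic group Z_n acting on a sampled circle stands in
# for SO(3) acting on the sphere. All names here are illustrative assumptions.

def shift(signal, k):
    """Rotate a circular signal by k samples (the group action)."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def circular_correlation(signal, kernel):
    """Group correlation on Z_n: out[g] = sum_i signal[i] * kernel[(i - g) % n]."""
    n = len(signal)
    return [
        sum(signal[i] * kernel[(i - g) % n] for i in range(n))
        for g in range(n)
    ]


def invariant_descriptor(signal, kernel):
    """Apply a pointwise ReLU, then sum over every group element.

    The correlation and the ReLU are shift-equivariant, so summing over the
    whole group discards the shift and leaves an invariant scalar.
    """
    features = circular_correlation(signal, kernel)
    return sum(max(value, 0) for value in features)


if __name__ == "__main__":
    signal = [3, -1, 4, 1, -5, 9, 2, -6]
    kernel = [1, -2, 0, 1, 0, 0, 0, 0]
    rotated = shift(signal, 3)

    # Equivariance: correlating a shifted input shifts the output.
    print(circular_correlation(rotated, kernel)
          == shift(circular_correlation(signal, kernel), 3))

    # Invariance: the pooled descriptor ignores the shift.
    print(invariant_descriptor(rotated, kernel)
          == invariant_descriptor(signal, kernel))
```

Integer inputs keep both comparisons exact. With floating-point data, the same checks would need a tolerance.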