Group Convolutional Neural Networks (G-CNNs) constrain learned features to respect the symmetries of the selected group, which leads to better generalization when these symmetries appear in the data. When they do not, however, equivariance leads to overly constrained models and worse performance. Frequently, the transformations occurring in data are better represented by a subset of a group than by the group as a whole, e.g., rotations in $[-90^{\circ}, 90^{\circ}]$. In such cases, a model that respects equivariance only $\textit{partially}$ is better suited to represent the data. In addition, the relevant transformations may differ for low- and high-level features. For instance, full rotation equivariance is useful to describe edge orientations in a face, but partial rotation equivariance is better suited to describe face poses relative to the camera. In other words, the optimal level of equivariance may differ per layer. In this work, we introduce $\textit{Partial G-CNNs}$: G-CNNs able to learn layer-wise levels of partial and full equivariance to discrete groups, continuous groups, and combinations thereof as part of training. Partial G-CNNs retain full equivariance when beneficial, e.g., for rotated MNIST, but adjust it whenever it becomes harmful, e.g., for the classification of 6 / 9 digits or natural images. We empirically show that partial G-CNNs match G-CNNs when full equivariance is advantageous, and outperform them otherwise.
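To make the mechanism concrete, below is a minimal PyTorch sketch of one way a layer-wise, learnable equivariance range could look. This is an illustrative assumption, not the paper's implementation: the class name `PartialRotationLifting` and the parameterization via `u_logit` are hypothetical. The layer samples rotation angles uniformly from $[-u\pi, u\pi]$, where the fraction $u \in (0, 1]$ is a trainable parameter, so $u = 1$ recovers full $SO(2)$ equivariance in the lifting step and $u \to 0$ collapses to an ordinary convolution.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class PartialRotationLifting(nn.Module):
    """Lifting convolution with a learnable range of rotation equivariance.

    Rotation angles are sampled uniformly from [-u*pi, u*pi], where the
    fraction u in (0, 1] is trained jointly with the filters. Hypothetical
    sketch of the idea, not the authors' implementation.
    """

    def __init__(self, in_ch, out_ch, kernel_size, num_rotations=8):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.1)
        # sigmoid(u_logit) = u; initialized near full equivariance.
        self.u_logit = nn.Parameter(torch.tensor(2.0))
        self.num_rotations = num_rotations
        self.padding = kernel_size // 2

    def _rotate_filters(self, angle):
        # Rotate the whole filter bank by `angle` with a bilinear resample.
        out_ch = self.weight.shape[0]
        c, s = torch.cos(angle), torch.sin(angle)
        zero = torch.zeros_like(c)
        rot = torch.stack([torch.stack([c, -s, zero]),
                           torch.stack([s,  c, zero])])  # (2, 3)
        rot = rot.unsqueeze(0).expand(out_ch, -1, -1)    # (out_ch, 2, 3)
        grid = F.affine_grid(rot, list(self.weight.shape), align_corners=False)
        return F.grid_sample(self.weight, grid, align_corners=False)

    def forward(self, x):
        # Reparameterized sampling keeps gradients flowing into u, so the
        # equivariance range itself is learned from data during training.
        u = torch.sigmoid(self.u_logit)
        eps = 2 * torch.rand(self.num_rotations, device=x.device) - 1  # U(-1, 1)
        angles = eps * u * math.pi
        responses = [F.conv2d(x, self._rotate_filters(a), padding=self.padding)
                     for a in angles]
        return torch.stack(responses, dim=1)  # (B, num_rotations, out_ch, H, W)


layer = PartialRotationLifting(in_ch=3, out_ch=16, kernel_size=5)
x = torch.randn(2, 3, 32, 32)
print(layer(x).shape)  # torch.Size([2, 8, 16, 32, 32])
```

Because each layer carries its own $u$, a network built from such layers can keep early layers close to fully equivariant (e.g., for edge orientations) while later layers restrict their range (e.g., for face poses), matching the layer-wise behavior the abstract describes.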