Recently, a variety of equivariant neural network architectures have been proposed that generalize across rotational and reflectional symmetries better than standard models. These models are relevant to robotics because many robotics problems can be expressed in a rotationally symmetric way. This paper focuses on equivariance over a visual state space and a spatial action space, the setting where the robot action space includes a subset of $\mathrm{SE}(2)$. In this situation, we know a priori that rotations and translations of the state image should result in the same rotations and translations in the spatial action dimensions of the optimal policy. Therefore, we can use equivariant model architectures to make $Q$-learning more sample efficient. This paper identifies when the optimal $Q$ function is equivariant and proposes $Q$ network architectures for this setting. We show experimentally that this approach outperforms standard methods on a set of challenging manipulation problems.
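The equivariance property described above can be illustrated with a small sketch. This is not the architecture proposed in the paper: a hand-written scoring rule with a rotation-symmetric 3x3 kernel stands in for a learned $Q$ network, only 90-degree rotations (a discrete subgroup of the rotations in $\mathrm{SE}(2)$) are considered, and every name below (`KERNEL`, `q_map`, `greedy_action`, `IMAGE`) is hypothetical. Under those assumptions, rotating the state image rotates the per-pixel $Q$ map, and therefore the greedy pixel action, in the same way.

```python
# Toy illustration only: a fixed, rotation-symmetric kernel stands in for a
# learned Q network. All names are hypothetical, not from the paper.

KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]  # unchanged by 90-degree rotations and reflections

IMAGE = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 3, 0],
    [0, 0, 0, 5, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]


def rot90(grid):
    """Rotate a square grid 90 degrees counter-clockwise."""
    n = len(grid)
    return [[grid[c][n - 1 - r] for c in range(n)] for r in range(n)]


def rot90_pixel(pixel, n):
    """Where pixel (i, j) lands after rot90 is applied to an n x n grid."""
    i, j = pixel
    return (n - 1 - j, i)


def q_map(image):
    """Score each pixel action by a zero-padded 3x3 weighted sum around it."""
    n = len(image)
    q = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < n and 0 <= c < n:
                        q[i][j] += KERNEL[di + 1][dj + 1] * image[r][c]
    return q


def greedy_action(image):
    """Return the pixel (row, col) with the highest score."""
    q = q_map(image)
    n = len(image)
    pixels = ((i, j) for i in range(n) for j in range(n))
    return max(pixels, key=lambda p: q[p[0]][p[1]])


action = greedy_action(IMAGE)
rotated_action = greedy_action(rot90(IMAGE))
# The greedy action for the rotated image is the rotated greedy action.
print(rotated_action == rot90_pixel(action, len(IMAGE)))
```

Because the symmetry is built into the scoring rule rather than learned from data, the rotated case needs no additional samples, which is the intuition behind the sample-efficiency argument made above.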