Representation Matters: Improving Perception and Exploration for Robotics

Markus Wulfmeier, Arunkumar Byravan, Tim Hertweck, Irina Higgins, Ankush Gupta, Tejas Kulkarni, Malcolm Reynolds, Denis Teplyashin, Roland Hafner, Thomas Lampe, Martin Riedmiller

From arXiv; published at ICRA 2021

Projecting high-dimensional environment observations into lower-dimensional structured representations can considerably improve data-efficiency for reinforcement learning in domains with limited data such as robotics. Can a single generally useful representation be found? In order to answer this question, it is important to understand how the representation will be used by the agent and what properties such a 'good' representation should have. In this paper we systematically evaluate a number of common learnt and hand-engineered representations in the context of three robotics tasks: lifting, stacking and pushing of 3D blocks. The representations are evaluated in two use-cases: as input to the agent, or as a source of auxiliary tasks. Furthermore, the value of each representation is evaluated in terms of three properties: dimensionality, observability and disentanglement. We can significantly improve performance in both use-cases and demonstrate that some representations can perform commensurate to simulator states as agent inputs. Finally, our results challenge common intuitions by demonstrating that: 1) dimensionality strongly matters for task generation, but is negligible for inputs, 2) observability of task-relevant aspects mostly affects the input representation use-case, and 3) disentanglement leads to better auxiliary tasks, but has only limited benefits for input representations. This work serves as a step towards a more systematic understanding of what makes a 'good' representation for control in robotics, enabling practitioners to make more informed choices for developing new learned or hand-engineered representations.
