Safety is a crucial property of every robotic platform: any control policy should always comply with actuator limits and avoid collisions with the environment and humans. In reinforcement learning, safety matters even more, because the agent must explore the environment without causing any damage. Although many solutions to the safe exploration problem have been proposed, only a few of them can deal with the complexity of the real world. This paper introduces a new formulation of safe exploration for reinforcement learning of various robotic tasks. By exploring the tangent space of the constraint manifold, our approach applies to a wide class of robotic platforms and enforces safety even under complex collision constraints learned from data. It achieves state-of-the-art performance in simulated high-dimensional and dynamic tasks while avoiding collisions with the environment. We show safe real-world deployment of our learned controller on a TIAGo++ robot, achieving remarkable performance in manipulation and human-robot interaction tasks.