Simulation has recently become key for deep reinforcement learning to safely and efficiently acquire general and complex control policies from visual and proprioceptive inputs. Tactile information is usually not considered, despite its direct relation to environment interaction. In this work, we present a suite of simulated environments tailored towards tactile robotics and reinforcement learning. We provide a simple and fast method of simulating optical tactile sensors, in which high-resolution contact geometry is represented as depth images. Proximal Policy Optimisation (PPO) is used to learn successful policies across all considered tasks. A data-driven approach translates the current state of a real tactile sensor into the corresponding simulated depth image. The learned policies are then run within a real-time control loop on a physical robot, demonstrating zero-shot sim-to-real policy transfer on several physically interactive tasks requiring a sense of touch.
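As a rough illustration of the deployment pipeline described in the abstract, the sketch below chains the two stages of one control step: a real tactile image is first mapped to a simulated-style depth image, and a policy trained in simulation then maps that image to an action. Every name here (`translate_real_to_sim`, `policy`, `control_step`) is hypothetical, and both stages are simple hand-written stand-ins, not the learned translation model or the PPO policy from this work.

```python
import numpy as np


def translate_real_to_sim(real_image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the learned real-to-sim image translation.

    Assumption: we only convert to grayscale and rescale to [0, 1].
    The work itself uses a data-driven (learned) translation instead.
    """
    gray = real_image.mean(axis=-1) if real_image.ndim == 3 else real_image
    gray = gray.astype(np.float32)
    span = gray.max() - gray.min()
    if span == 0:
        return np.zeros_like(gray)
    return (gray - gray.min()) / span


def policy(sim_depth: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a policy trained in simulation (e.g. with PPO).

    Assumption: returns a 2D action in [-1, 1] given by the offset of the
    intensity-weighted contact centroid from the image centre.
    """
    h, w = sim_depth.shape
    total = sim_depth.sum()
    if total == 0:
        # No contact detected: command no motion.
        return np.zeros(2, dtype=np.float32)
    ys, xs = np.mgrid[0:h, 0:w]
    cy = (ys * sim_depth).sum() / total
    cx = (xs * sim_depth).sum() / total
    return np.array([2 * cx / (w - 1) - 1, 2 * cy / (h - 1) - 1], dtype=np.float32)


def control_step(real_image: np.ndarray) -> np.ndarray:
    """One iteration of the control loop: real image -> simulated depth -> action."""
    sim_depth = translate_real_to_sim(real_image)
    return policy(sim_depth)


if __name__ == "__main__":
    # Synthetic 64x64 RGB frame with a single bright contact on the right edge.
    frame = np.zeros((64, 64, 3), dtype=np.uint8)
    frame[32, 63] = 255
    print(control_step(frame))
```

Because the policy only ever sees images in the simulated domain, the same policy can in principle be used unchanged in simulation and on the robot; only the translation stage touches real sensor data.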