Simulation-to-reality transfer has emerged as a popular and highly successful method to train robotic control policies for a wide variety of tasks. However, it is often challenging to determine when policies trained in simulation are ready to be transferred to the physical world. Deploying policies that have been trained with very little simulation data can result in unreliable and dangerous behaviors on physical hardware. On the other hand, excessive training in simulation can cause policies to overfit to the visual appearance and dynamics of the simulator. In this work, we study strategies to automatically determine when policies trained in simulation can be reliably transferred to a physical robot. We specifically study these ideas in the context of robotic fabric manipulation, in which successful sim2real transfer is especially challenging due to the difficulties of precisely modeling the dynamics and visual appearance of fabric. Results in a fabric smoothing task suggest that our switching criteria correlate well with real-world performance. In particular, our confidence-based switching criteria achieve average final fabric coverage of 87.2-93.7% within 55-60% of the total training budget. See https://tinyurl.com/lsc-case for code and supplemental materials.