Collecting training data from the physical world is usually time-consuming and even dangerous for fragile robots, and thus, recent advances in robot learning advocate the use of simulators as the training platform. Unfortunately, the reality gap between synthetic and real visual data prohibits direct migration of models trained in virtual worlds to the real world. This paper proposes a modular architecture for tackling the virtual-to-real problem. The proposed architecture separates the learning model into a perception module and a control policy module, and uses semantic image segmentation as the meta representation relating these two modules. The perception module translates the perceived RGB image to semantic image segmentation. The control policy module is implemented as a deep reinforcement learning agent, which performs actions based on the translated image segmentation. Our architecture is evaluated on an obstacle avoidance task and a target following task. Experimental results show that our architecture significantly outperforms all of the baseline methods in both virtual and real environments, and demonstrates a faster learning curve than those of the baselines. We also present a detailed analysis of a variety of variant configurations, and validate the transferability of our modular architecture.
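The two-module decomposition described above can be sketched in code. This is a minimal illustrative sketch, not the paper's implementation: the class names, the threshold-based segmentation stub, and the column-scanning policy are all invented for illustration. In the actual architecture, the perception module would be a trained segmentation network and the control policy module a deep RL agent; the key point shown here is that the segmentation map is the sole interface between them.

```python
import numpy as np


class PerceptionModule:
    """Translates an RGB image into a semantic segmentation map.
    Stubbed with an intensity threshold purely for illustration;
    the paper uses a learned segmentation network."""

    def __init__(self, num_classes=3):
        self.num_classes = num_classes

    def segment(self, rgb):
        # rgb: (H, W, 3) uint8 array -> (H, W) integer class labels
        intensity = rgb.mean(axis=-1)
        return (intensity * self.num_classes / 256).astype(int)


class ControlPolicyModule:
    """Maps a segmentation map to an action. Stubbed as a hand-written
    rule for illustration; the paper uses a deep RL agent. Here, class 0
    is assumed to mean 'obstacle', and the action is the image column
    with the fewest obstacle pixels (a crude steering direction)."""

    def act(self, seg):
        obstacle_per_column = (seg == 0).sum(axis=0)
        return int(np.argmin(obstacle_per_column))


def step(rgb, perception, policy):
    """One modular control step: the segmentation map is the only
    data passed between the perception and policy modules."""
    return policy.act(perception.segment(rgb))
```

Because the modules interact only through the segmentation representation, either side can be retrained or swapped (e.g., perception retrained on real images) without touching the other, which is what enables the virtual-to-real transfer the paper evaluates.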