Simulation-to-real (Sim-to-Real) transfer is an attractive approach for constructing controllers for robotic tasks that are easier to simulate than to solve analytically. Working Sim-to-Real solutions have been demonstrated for tasks with a single clear objective, such as "reach the target". Real-world applications, however, often involve multiple simultaneous objectives, such as "reach the target" while "avoiding obstacles". A straightforward solution in the context of reinforcement learning (RL) is to combine the objectives into a multi-term reward function and train a single monolithic controller. Recently, a hybrid solution was proposed that is based on pre-trained single-objective controllers and a switching rule between them. In this work, we compare these two approaches in a multi-objective setting in which a robot manipulator must reach a target while avoiding an obstacle. Our findings show that the hybrid controller is easier to train and obtains a better success-failure trade-off than the monolithic controller. The controllers trained in simulation were verified on a real setup.
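To make the two compared approaches concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a Cartesian end-effector position, a reward built from a distance-to-target term and an obstacle-proximity penalty, and a simple distance-threshold switching rule. The function names, the weights `w_reach` and `w_avoid`, the distances `safe_dist` and `switch_dist`, and the stand-in policies are all hypothetical; the abstract does not specify the reward terms or the switching rule.

```python
import numpy as np


def multi_term_reward(ee_pos, target_pos, obstacle_pos,
                      w_reach=1.0, w_avoid=0.5, safe_dist=0.1):
    """Monolithic approach: one scalar reward combining both objectives.

    All weights and the safety distance are illustrative assumptions.
    """
    d_target = np.linalg.norm(ee_pos - target_pos)
    d_obstacle = np.linalg.norm(ee_pos - obstacle_pos)
    reach_term = -d_target                          # closer to target is better
    avoid_term = -max(0.0, safe_dist - d_obstacle)  # penalize entering the safety zone
    return w_reach * reach_term + w_avoid * avoid_term


def hybrid_action(ee_pos, target_pos, obstacle_pos,
                  reach_policy, avoid_policy, switch_dist=0.15):
    """Hybrid approach: switch between two pre-trained single-objective policies.

    The threshold rule below is an assumed example of a switching rule.
    """
    d_obstacle = np.linalg.norm(ee_pos - obstacle_pos)
    if d_obstacle < switch_dist:
        return avoid_policy(ee_pos, obstacle_pos)
    return reach_policy(ee_pos, target_pos)


# Stand-ins for pre-trained controllers (hypothetical, for illustration only).
def reach_policy(ee_pos, target_pos):
    direction = target_pos - ee_pos
    return direction / np.linalg.norm(direction)  # unit step toward the target


def avoid_policy(ee_pos, obstacle_pos):
    direction = ee_pos - obstacle_pos
    return direction / np.linalg.norm(direction)  # unit step away from the obstacle


ee = np.array([0.0, 0.0, 0.0])
target = np.array([1.0, 0.0, 0.0])
near_obstacle = np.array([0.0, 0.05, 0.0])
far_obstacle = np.array([0.0, 1.0, 0.0])

reward_near = multi_term_reward(ee, target, near_obstacle)
action_near = hybrid_action(ee, target, near_obstacle, reach_policy, avoid_policy)
action_far = hybrid_action(ee, target, far_obstacle, reach_policy, avoid_policy)
```

In the monolithic case the trade-off between the objectives is fixed by the reward weights before training, whereas in this sketch of the hybrid case it is set by the switching threshold, which can be adjusted without retraining either policy.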