While classical control theory offers state-of-the-art solutions in many problem scenarios, it is often desirable to improve on the structure of such solutions and surpass their limitations. To this end, residual policy learning (RPL) offers a formulation for improving existing controllers with reinforcement learning (RL) by learning an additive "residual" to the output of a given controller. However, the applicability of this approach depends strongly on the structure of the controller. Often, internal feedback signals of the controller prevent an RL algorithm from adequately changing the policy and, hence, from learning the task. We propose a new formulation that addresses these limitations by also modifying the feedback signals to the controller with an RL policy, and we show superior performance of our approach on a contact-rich peg-insertion task under position and orientation uncertainty. In addition, we use a recent impedance control architecture as the control framework and show the difficulties of standard RPL. Furthermore, we introduce an adaptive curriculum for the given task that gradually increases the task difficulty in terms of position and orientation uncertainty. A video showing the results can be found at https://youtu.be/SAZm_Krze7U.
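The difference between the two formulations described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the one-dimensional PD controller stands in for the impedance controller, the names `PDController`, `standard_rpl_step`, and `feedback_modified_step` are hypothetical, and the policy is a plain callable rather than a trained RL policy. The abstract does not specify how the feedback signals are modified, so the additive position offset used here is only one plausible reading.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class PDController:
    """Toy 1-D PD controller (hypothetical stand-in for the base controller)."""
    kp: float = 10.0
    kd: float = 2.0

    def __call__(self, target: float, pos: float, vel: float) -> float:
        # The measured position and velocity are the controller's feedback signals.
        return self.kp * (target - pos) - self.kd * vel


# Assumed policy interface: observation -> (output residual, feedback offset).
Policy = Callable[[Tuple[float, float, float]], Tuple[float, float]]


def standard_rpl_step(ctrl: PDController, policy: Policy,
                      target: float, pos: float, vel: float) -> float:
    """Standard RPL: the policy only adds a residual to the controller output."""
    du, _ = policy((target, pos, vel))
    return ctrl(target, pos, vel) + du


def feedback_modified_step(ctrl: PDController, policy: Policy,
                           target: float, pos: float, vel: float) -> float:
    """Assumed variant: the policy also shifts the position feedback
    that the controller sees, in addition to the output residual."""
    du, dpos = policy((target, pos, vel))
    return ctrl(target, pos + dpos, vel) + du


if __name__ == "__main__":
    ctrl = PDController()
    policy = lambda obs: (0.5, 0.1)  # fixed placeholder, not a learned policy
    print(standard_rpl_step(ctrl, policy, 1.0, 0.0, 0.0))
    print(feedback_modified_step(ctrl, policy, 1.0, 0.0, 0.0))
```

In this toy setting, the feedback offset changes the error term that the controller itself computes, whereas the output residual can only counteract the controller's command after the fact.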