We address the problem of solving complex bimanual robot manipulation tasks on multiple objects with sparse rewards. Such tasks can be decomposed into sub-tasks that different robots can accomplish concurrently or sequentially for better efficiency. While previous reinforcement learning approaches primarily focus on modeling the compositionality of sub-tasks, two fundamental issues are largely ignored, particularly when learning cooperative strategies for two robots: (i) domination, i.e., one robot may try to solve a task by itself and leave the other idle; (ii) conflict, i.e., one robot can easily intrude into the other's workspace when the two execute different sub-tasks simultaneously. To tackle these two issues, we propose a novel technique called disentangled attention, which provides an intrinsic regularization that encourages the two robots to focus on separate sub-tasks and objects. We evaluate our method on four bimanual manipulation tasks. Experimental results show that the proposed intrinsic regularization successfully avoids domination and reduces conflicts in the learned policies, leading to significantly more effective cooperative strategies than all the baselines. Our project page with videos is at https://mehooz.github.io/bimanual-attention.
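To make the idea of an attention-based intrinsic regularization concrete, the following is a minimal sketch and not the method's actual implementation. It assumes each robot produces attention logits over the same set of entities (objects or sub-tasks) and that overlap is measured as the dot product of the two softmax distributions; the function names `softmax` and `attention_overlap`, the choice of dot product, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_overlap(scores_a, scores_b):
    """Dot product of the two robots' attention distributions.

    Hypothetical regularizer: close to 0 when the robots attend to
    different entities, and larger when they attend to the same ones.
    """
    p_a = softmax(scores_a)
    p_b = softmax(scores_b)
    return sum(a * b for a, b in zip(p_a, p_b))


# Made-up attention logits over three entities, for illustration only.
focused_apart = attention_overlap([8.0, 0.0, 0.0], [0.0, 8.0, 0.0])
focused_same = attention_overlap([8.0, 0.0, 0.0], [8.0, 0.0, 0.0])
```

Under these assumptions, adding such an overlap term to the training loss would penalize both robots attending to the same entity, which is one plausible way to discourage the domination and conflict behaviors described above.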