We present CompoSuite, an open-source simulated robotic manipulation benchmark for compositional multi-task reinforcement learning (RL). Each CompoSuite task requires a particular robot arm to manipulate one individual object to achieve a task objective while avoiding an obstacle. This compositional definition of the tasks endows CompoSuite with two remarkable properties. First, varying the robot/object/objective/obstacle elements leads to hundreds of RL tasks, each of which requires a meaningfully different behavior. Second, RL approaches can be evaluated specifically for their ability to learn the compositional structure of the tasks. This latter ability to functionally decompose problems would enable intelligent agents to identify and exploit commonalities between learning tasks in order to handle large varieties of highly diverse problems. We benchmark existing single-task, multi-task, and compositional learning algorithms in various training settings, and assess their capability to compositionally generalize to unseen tasks. Our evaluation exposes the shortcomings of existing RL approaches with respect to compositionality and opens new avenues for investigation.
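To make the combinatorial task definition concrete, the following minimal sketch enumerates tasks as tuples of (robot, object, objective, obstacle) and holds out a subset for testing generalization to unseen combinations. The element names, the choice of four options per axis, the training-set size, and the helper functions are all hypothetical placeholders chosen for illustration; they are not taken from the CompoSuite code base or its API.

```python
import itertools
import random

# Hypothetical placeholder elements; names and counts are assumptions
# for illustration, not the benchmark's actual element lists.
ROBOTS = ["robot_a", "robot_b", "robot_c", "robot_d"]
OBJECTS = ["object_a", "object_b", "object_c", "object_d"]
OBJECTIVES = ["objective_a", "objective_b", "objective_c", "objective_d"]
OBSTACLES = ["obstacle_a", "obstacle_b", "obstacle_c", "obstacle_d"]
AXES = [ROBOTS, OBJECTS, OBJECTIVES, OBSTACLES]


def enumerate_tasks():
    """Return every (robot, object, objective, obstacle) combination."""
    return list(itertools.product(*AXES))


def split_tasks(tasks, num_train, seed=0):
    """Randomly split tasks into a training set and a held-out set."""
    rng = random.Random(seed)
    shuffled = list(tasks)
    rng.shuffle(shuffled)
    return shuffled[:num_train], shuffled[num_train:]


def covers_all_elements(train_tasks):
    """Check that every element of every axis occurs in some training task."""
    for index, axis in enumerate(AXES):
        seen = {task[index] for task in train_tasks}
        if seen != set(axis):
            return False
    return True


tasks = enumerate_tasks()
train_tasks, test_tasks = split_tasks(tasks, num_train=56)
print(len(tasks), len(train_tasks), len(test_tasks))
print("all elements seen in training:", covers_all_elements(train_tasks))
```

With four options per axis, the product yields 4 × 4 × 4 × 4 = 256 tasks, which is consistent with the "hundreds of RL tasks" described above. The coverage check reflects one plausible requirement for a compositional evaluation: a held-out task can only be solved by recombination if each of its elements has appeared in some training task.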