A Supervisory Learning Control Framework for Autonomous & Real-time Task Planning for an Underactuated Cooperative Robotic Task

We introduce a framework for cooperative manipulation and apply it to an underactuated manipulation problem in which two stationary robotic manipulators must cooperate to reposition an object within their shared workspace. Unlike swarming, where agents serve a common objective through individual control strategies with little to no communication, control of multi-agent systems for manipulation tasks requires a coordination strategy that assigns subtasks to the individual agents. We formulate the problem in a Task And Motion Planning (TAMP) setting and adopt a decomposition strategy that allows us to treat the task planning and motion planning problems separately. We solve the supervisory planning problem offline using deep Reinforcement Learning techniques, resulting in a supervisory policy capable of coordinating the two manipulators into a successful execution of the pick-and-place task. A further benefit of solving the task planning problem offline is that it enables real-time (re)planning, which provides robustness in the event of subtask execution failure or on-the-fly task changes. The framework achieved zero-shot deployment on the real setup with a success rate above 90%.
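
The supervisory loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the framework itself: the one-dimensional workspace, the reach sets in `REACH`, the hand-written `supervisory_policy` (a stand-in for the learned policy), and the random failure model in `execute` (a stand-in for the motion planner). The point it shows is that, because the policy is simply queried on the currently observed state, replanning after a failed subtask amounts to querying it again.

```python
import random
from dataclasses import dataclass

# Hypothetical toy model: the object sits in one of ten cells along a table.
# Arm "A" reaches cells 0-5 and arm "B" reaches cells 4-9, so cells 4-5 are shared.
REACH = {"A": range(0, 6), "B": range(4, 10)}
HANDOVER_CELL = 5  # a cell inside the shared region


@dataclass
class Subtask:
    agent: str   # which manipulator acts
    target: int  # cell the object should be moved to


def supervisory_policy(obj: int, goal: int) -> Subtask:
    """Stand-in for the learned policy: map the observed state to a subtask."""
    for agent, cells in REACH.items():
        if obj in cells and goal in cells:
            return Subtask(agent, goal)  # one arm can finish the task alone
    holder = next(a for a, cells in REACH.items() if obj in cells)
    return Subtask(holder, HANDOVER_CELL)  # otherwise pass via the shared region


def execute(subtask: Subtask, obj: int, rng: random.Random, fail_prob: float) -> int:
    """Stand-in for motion planning and execution: return the new object cell.
    On failure the object stays where it was."""
    if rng.random() < fail_prob:
        return obj
    return subtask.target


def run_episode(start: int, goal: int, fail_prob: float = 0.3,
                max_steps: int = 50, seed: int = 0):
    rng = random.Random(seed)
    obj, log = start, []
    for _ in range(max_steps):
        if obj == goal:
            return True, log
        # The policy is re-queried on the observed state at every step,
        # so a failed subtask is replanned without any special handling.
        subtask = supervisory_policy(obj, goal)
        log.append(subtask)
        obj = execute(subtask, obj, rng, fail_prob)
    return obj == goal, log
```

With `fail_prob=0.0`, `run_episode(1, 8)` issues two subtasks in this toy model: arm A moves the object to the handover cell, then arm B moves it to the goal. With a nonzero failure probability the same subtask is reissued until it succeeds.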
