Multi-robot manipulation tasks involve multiple control entities that can be separated into dynamically independent parts. A typical real-world example is dual-arm manipulation. Naively solving such tasks with reinforcement learning is often infeasible, because sample complexity and exploration requirements grow with the dimensionality of the action and state spaces. Instead, we would like to treat such environments as multi-agent systems in which several agents each control a part of the whole. However, decentralizing action generation requires coordination across agents through a channel restricted to information central to the task. This paper proposes an approach to coordinating multi-robot manipulation through learned latent action spaces that are shared across agents. We validate our method on simulated multi-robot manipulation tasks and demonstrate improvements over previous baselines in sample efficiency and learning performance.