Commercial and industrial deployments of robot fleets often fall back on remote human teleoperators during execution when robots are at risk or unable to make task progress. With continual learning, interventions from the remote pool of humans can also be used to improve the robot fleet control policy over time. A central question is how to effectively allocate limited human attention to individual robots. Prior work addresses this in the single-robot, single-human setting. We formalize the Interactive Fleet Learning (IFL) setting, in which multiple robots interactively query and learn from multiple human supervisors. We present a fully implemented open-source IFL benchmark suite of GPU-accelerated Isaac Gym environments for the evaluation of IFL algorithms. We propose Fleet-DAgger, a family of IFL algorithms, and compare a novel Fleet-DAgger algorithm to 4 baselines in simulation. We also perform 1000 trials of a physical block-pushing experiment with 4 ABB YuMi robot arms. Experiments suggest that the allocation of humans to robots significantly affects robot fleet performance, and that our algorithm achieves up to 8.8x higher return on human effort than baselines. See https://tinyurl.com/fleet-dagger for code, videos, and supplemental material.
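To make the allocation question concrete, the following is a minimal sketch of one possible allocation rule, not the Fleet-DAgger algorithm itself. It assumes each robot reports a scalar priority score (for example, an estimate of risk or of lack of task progress) and that the humans are assigned to the highest-priority robots at each step. The function name `allocate_humans`, the score values, and the rule that only positive scores count as a request are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def allocate_humans(priorities: Dict[int, float], num_humans: int) -> List[int]:
    """Return the IDs of robots that receive a human supervisor this step.

    `priorities` maps robot ID -> hypothetical priority score. Higher means
    the robot needs help more urgently. This is an illustrative rule only.
    """
    # Assumption: a robot requests help only if its priority is positive.
    requesting = [robot for robot, score in priorities.items() if score > 0.0]
    # Sort by descending priority; break ties by robot ID for determinism.
    requesting.sort(key=lambda robot: (-priorities[robot], robot))
    # With fewer humans than requests, only the top `num_humans` are served.
    return requesting[:num_humans]


if __name__ == "__main__":
    scores = {0: 0.1, 1: 0.9, 2: 0.0, 3: 0.5}  # made-up example scores
    print(allocate_humans(scores, num_humans=2))  # [1, 3]
```

With four robots and two humans, the two highest-scoring robots are served and the rest continue autonomously. How the priority score is computed is the part that differs between allocation strategies.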