Assistive robot arms try to help their users perform everyday tasks. One way robots can provide this assistance is shared autonomy. Within shared autonomy, both the human and robot maintain control over the robot's motion: as the robot becomes confident it understands what the human wants, it intervenes to automate the task. But how does the robot know these tasks in the first place? State-of-the-art approaches to shared autonomy often rely on prior knowledge. For instance, the robot may need to know the human's potential goals beforehand. During long-term interaction these methods will inevitably break down: sooner or later the human will attempt to perform a task that the robot does not expect. Accordingly, in this paper we formulate an alternate approach to shared autonomy that learns assistance from scratch. Our insight is that operators repeat important tasks on a daily basis (e.g., opening the fridge, making coffee). Instead of relying on prior knowledge, we therefore take advantage of these repeated interactions to learn assistive policies. We formalize an algorithm that recognizes the human's task, replicates similar demonstrations, and returns control when unsure. We then combine learning with control to demonstrate that the error of our approach is uniformly ultimately bounded. We perform simulations to support this error bound, compare our approach to imitation learning baselines, and explore its capacity to assist as the number of tasks increases. Finally, we conduct a user study with industry-standard methods and shared autonomy baselines. Our results indicate that learning shared autonomy across repeated interactions (SARI) matches existing approaches for known goals, and outperforms the baselines on tasks that were never specified beforehand.
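The three behaviors named in the abstract (recognize the task, replicate similar demonstrations, return control when unsure) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function `assist`, the nearest-neighbor lookup, the exponential confidence measure, and the `scale` and `threshold` parameters are all assumptions chosen for illustration.

```python
import math

def assist(state, human_action, demos, scale=0.5, threshold=0.3):
    """Blend the human's command with a remembered demonstration.

    Hypothetical sketch: `demos` is a list of (state, action) pairs
    recorded during earlier interactions. Confidence decays with the
    distance from the current state to the nearest demonstrated state.
    """
    if not demos:
        # Nothing has been learned yet, so the human keeps full control.
        return list(human_action), 0.0

    # "Recognize": find the past demonstration closest to the current state.
    nearest_state, nearest_action = min(
        demos, key=lambda d: math.dist(d[0], state)
    )
    confidence = math.exp(-math.dist(nearest_state, state) / scale)

    # "Return control when unsure": far from every demonstration.
    if confidence < threshold:
        return list(human_action), confidence

    # "Replicate": weight the demonstrated action by the confidence.
    blended = [
        (1.0 - confidence) * h + confidence * r
        for h, r in zip(human_action, nearest_action)
    ]
    return blended, confidence
```

Called at a state that matches a stored demonstration, this sketch returns the demonstrated action with confidence 1.0; called far from every demonstration, it passes the human's command through unchanged. A learned policy with a calibrated uncertainty estimate would replace the nearest-neighbor lookup in a real system.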