Assistive robot arms try to help their users perform everyday tasks. One way robots can provide this assistance is shared autonomy. Within shared autonomy, both the human and robot maintain control over the robot's motion: as the robot becomes confident it understands what the human wants, it intervenes to automate the task. But how does the robot know these tasks in the first place? State-of-the-art approaches to shared autonomy often rely on prior knowledge. For instance, the robot may need to know the human's potential goals beforehand. During long-term interaction these methods will inevitably break down -- sooner or later the human will attempt to perform a task that the robot does not expect. Accordingly, in this paper we formulate an alternative approach to shared autonomy that learns assistance from scratch. Our insight is that operators repeat important tasks on a daily basis (e.g., opening the fridge, making coffee). Instead of relying on prior knowledge, we therefore take advantage of these repeated interactions to learn assistive policies. We introduce SARI, an algorithm that recognizes the human's task, replicates similar demonstrations, and returns control when unsure. We then combine learning with control to demonstrate that the error of our approach is uniformly ultimately bounded. We perform simulations to support this error bound, compare our approach to imitation learning baselines, and explore its capacity to assist as the number of tasks increases. Finally, we conduct three user studies that compare our approach to industry-standard methods and shared autonomy baselines, including a pilot test with a disabled user. Our results indicate that learning shared autonomy across repeated interactions matches existing approaches for known tasks and outperforms baselines on new tasks. See videos of our user studies here: https://youtu.be/3vE4omSvLvc
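
To make the idea of sharing control concrete, the following is a minimal sketch of confidence-based arbitration between a human command and a robot command. It is not SARI's actual update rule, which the abstract does not specify. The linear blend, the function name `blend_action`, and the `threshold` parameter are all assumptions made for illustration.

```python
from typing import List, Sequence


def blend_action(
    human: Sequence[float],
    robot: Sequence[float],
    confidence: float,
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[float]:
    """Blend a human command with a robot command (illustrative only).

    Assumes a simple linear blend weighted by the robot's confidence.
    Below the threshold the robot is treated as unsure and the human
    command is passed through unchanged.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if len(human) != len(robot):
        raise ValueError("human and robot commands must have equal length")

    # Return full control to the human when the robot is unsure.
    alpha = confidence if confidence >= threshold else 0.0

    # Weighted average of the two commands, element by element.
    return [(1.0 - alpha) * h + alpha * r for h, r in zip(human, robot)]
```

With a low confidence such as `blend_action([1.0, 0.0], [0.0, 1.0], 0.2)`, the human command is returned unchanged. As confidence approaches 1, the output approaches the robot's command.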