Much work in robotics has focused on "human-in-the-loop" learning techniques that improve the efficiency of the learning process. However, these algorithms make the strong assumption of a cooperative human supervisor who assists the robot; in reality, human observers also tend to act adversarially towards deployed robotic systems. We show that such adversarial behavior can in fact improve the robustness of the learned models: we propose a physical framework that leverages perturbations applied by a human adversary to guide the robot towards more robust models. In a manipulation task, we show that grasping success improves significantly when the robot trains with a human adversary, compared to training in a self-supervised manner.
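To make the training scheme concrete, below is a minimal, self-contained sketch of the idea, not the paper's implementation: a bandit-style learner picks a grasp "firmness" level, and a simulated human adversary perturbs the object before the grasp outcome is scored. The firmness levels, noise and disturbance ranges, and the small effort penalty (which makes the self-supervised learner favor the weakest grasp that works) are all illustrative assumptions.

```python
import random

# Illustrative sketch only: a bandit-style learner over discrete grasp
# "firmness" levels, with and without a simulated human adversary.
# All quantities below (levels, noise ranges, effort penalty) are
# assumptions for the toy example, not values from the paper.

FIRMNESS_LEVELS = [0.2, 0.4, 0.6, 0.8, 1.0]
EFFORT_PENALTY = 0.05  # assumed: firmer grasps cost slightly more effort

def grasp_survives(firmness: float, disturbance: float) -> bool:
    """A grasp holds if its firmness exceeds ambient noise + disturbance."""
    ambient = random.uniform(0.0, 0.3)
    return firmness > ambient + disturbance

def train(adversarial: bool, episodes: int = 5000) -> float:
    """Epsilon-greedy bandit over firmness levels; returns the learned level."""
    values = {f: 0.0 for f in FIRMNESS_LEVELS}
    counts = {f: 0 for f in FIRMNESS_LEVELS}
    for _ in range(episodes):
        firmness = (random.choice(FIRMNESS_LEVELS) if random.random() < 0.1
                    else max(values, key=values.get))
        # The human adversary perturbs the object during training;
        # self-supervised training sees no disturbance at all.
        disturbance = random.uniform(0.2, 0.5) if adversarial else 0.0
        reward = (float(grasp_survives(firmness, disturbance))
                  - EFFORT_PENALTY * firmness)
        counts[firmness] += 1
        # Incremental running mean of each arm's reward.
        values[firmness] += (reward - values[firmness]) / counts[firmness]
    return max(values, key=values.get)

def evaluate(firmness: float, trials: int = 2000) -> float:
    """Test-time robustness: every grasp attempt is perturbed."""
    hits = sum(grasp_survives(firmness, random.uniform(0.2, 0.5))
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for adversarial in (False, True):
        learned = train(adversarial)
        label = "adversarial" if adversarial else "self-supervised"
        print(f"{label}: learned firmness={learned}, "
              f"perturbed success rate={evaluate(learned):.2f}")
```

Under these assumptions, the self-supervised learner settles on a weak grasp that fails once perturbations appear at test time, while the adversarially trained learner converges to a firmer, more robust grasp, mirroring the effect described above.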