Humans have internal models of robots (like their physical capabilities), the world (like what will happen next), and their tasks (like a preferred goal). However, human internal models are not always perfect: for example, it is easy to underestimate a robot's inertia. Nevertheless, these models change and improve over time as humans gather more experience. Interestingly, robot actions influence what this experience is, and therefore influence how people's internal models change. In this work we take a step towards enabling robots to understand the influence they have, leverage it to better assist people, and help human models more quickly align with reality. Our key idea is to model the human's learning as a nonlinear dynamical system which evolves the human's internal model given new observations. We formulate a novel optimization problem to infer the human's learning dynamics from demonstrations that naturally exhibit human learning. We then formalize how robots can influence human learning by embedding the human's learning dynamics model into the robot planning problem. Although our formulations provide concrete problem statements, they are intractable to solve in full generality. We contribute an approximation that sacrifices the complexity of the human internal models we can represent, but enables robots to learn the nonlinear dynamics of these internal models. We evaluate our inference and planning methods in a suite of simulated environments and an in-person user study, where a 7DOF robotic arm teaches participants to be better teleoperators. While influencing human learning remains an open problem, our results demonstrate that this influence is possible and can be helpful in real human-robot interaction.
翻译:人类拥有机器人的内部模型(比如其物理能力)、世界(比如接下来会发生什么)、以及他们的任务(比如下一个目标 ) 。然而,人类的内部模型并非总是完美的:例如,很容易低估机器人的惯性。然而,随着人类积累更多的经验,这些模型会随着时间而变化和改进。有趣的是,机器人行动会影响这种经验,从而影响人们的内部模型的变化。在这项工作中,我们迈出了一步,使机器人能够理解其影响,利用它来更好地帮助人们,并帮助人类模型更迅速地适应现实。然而,我们的关键思想是将人类的学习模型建成一个非线性动态系统,这种系统可以改变人类的内部模型。我们设计了一个新的优化问题来推断人类从自然展示人类学习的演示中学习动态。然后我们确定机器人如何通过将人类学习的动态模型嵌入机器人的规划问题来影响人类的学习。尽管我们的公式提供了具体的问题说明,但能够更迅速地解决整个问题。我们的关键思想是将人类的学习模式的复杂程度转化为我们内部的模型的模型,而使机器人的模型的模型成为了一种不易变的模型。