To act in the world, robots rely on a representation of salient task aspects: for example, to carry a cup of coffee, a robot must consider movement efficiency and cup orientation in its behaviour. However, if we want robots to act for and with people, their representations must not only be functional but also reflective of what humans care about, i.e., their representations must be aligned with humans'. In this survey, we posit that current reward and imitation learning approaches suffer from representation misalignment, where the robot's learned representation fails to capture the human's. We suggest that because humans will be the ultimate evaluators of robot performance in the world, it is critical that we explicitly focus our efforts on aligning learned task representations with those of humans, in addition to learning the downstream task. We advocate that current representation learning approaches in robotics be studied from the perspective of how well they accomplish this alignment objective. To do so, we mathematically define the problem, identify its key desiderata, and situate current robot learning methods within this formalism. We conclude the survey by suggesting future directions for exploring open challenges.
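For concreteness, one minimal way to sketch the representation misalignment the survey refers to (the notation below is illustrative and not drawn from the survey itself) is to compare the human's feature map with the robot's learned one, up to an admissible transformation:

\[
\operatorname{misalignment}(\phi_R, \phi_H) \;=\; \min_{g \in \mathcal{G}} \; \mathbb{E}_{s \sim \mathcal{D}}\!\left[ \, d\!\left( g(\phi_R(s)),\; \phi_H(s) \right) \right],
\]

where \(\phi_H : \mathcal{S} \to \mathbb{R}^d\) is the human's representation over task states \(\mathcal{S}\), \(\phi_R : \mathcal{S} \to \mathbb{R}^k\) is the robot's learned representation, \(\mathcal{G}\) is a class of admissible transformations (since two representations need only agree up to, e.g., an invertible map), \(d\) is a distance on the human's feature space, and \(\mathcal{D}\) is a distribution over task-relevant states. Under this reading, representation alignment asks that this quantity be small in addition to the downstream reward or imitation objective being optimized.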