We present SLOT-V, a novel supervised learning framework that learns observer models (human preferences) from robot motion trajectories in a legibility context. Legibility measures how easily a (human) observer can infer the robot's goal from a robot motion trajectory. When generating such trajectories, existing planners often rely on an observer model that estimates the quality of trajectory candidates. These observer models are frequently hand-crafted or, occasionally, learned from demonstrations. Here, we propose to learn them in a supervised manner using the same data format that is frequently used to evaluate the aforementioned approaches. We then demonstrate the generality of SLOT-V using a Franka Emika robot in a simulated manipulation environment. First, we show that it can learn to closely predict various hand-crafted observer models, i.e., that SLOT-V's hypothesis space encompasses existing hand-crafted models. Next, we showcase SLOT-V's ability to generalize by showing that a trained model continues to perform well in environments with unseen goal configurations and/or goal counts. Finally, we benchmark SLOT-V's sample efficiency (and performance) against an existing inverse reinforcement learning (IRL) approach and show that SLOT-V learns better observer models with less data. Combined, these results suggest that SLOT-V can learn viable observer models. Better observer models imply more legible trajectories, which may, in turn, lead to better and more transparent human-robot interaction.