Human impressions of robot performance are often measured through surveys. As a more scalable and cost-effective alternative, we investigate the possibility of predicting people's impressions of robot behavior using non-verbal behavioral cues and machine learning techniques. To this end, we first contribute the SEAN TOGETHER Dataset, which consists of observations of interactions between a person and a mobile robot in a VR simulation, together with the person's impressions of robot performance on a 5-point scale. Second, we contribute analyses of how well humans and supervised learning techniques can predict perceived robot performance from different observation types, such as facial expression features and features that describe the navigation behavior of the robot and nearby pedestrians. Our results suggest that facial expressions alone provide useful information about human impressions of robot performance, but in the navigation scenarios we considered, reasoning about spatial features in context is critical for the prediction task. Supervised learning techniques also showed promise, as they outperformed humans' predictions of robot performance in most cases. Further, when robot performance was predicted as a binary classification task on unseen users' data, the F1 score of machine learning models more than doubled compared to predicting performance on a 5-point scale. This suggests that the models can generalize well, although they are better at identifying the direction of robot performance than at predicting exact ratings. Based on our findings in simulation, we conducted a real-world demonstration in which a mobile robot used a machine learning model to predict how a human following it perceived its performance. Finally, we discuss the implications of our results for deploying such supervised learning models in real-world navigation scenarios.
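To make the binary formulation concrete, the sketch below shows one plausible way to binarize 5-point ratings, hold out entire users, and evaluate with the F1 score. It is a minimal illustration assuming scikit-learn; the feature dimensions, classifier choice, and the midpoint threshold are hypothetical and not taken from the paper.

```python
# Sketch: binary classification of perceived robot performance on
# unseen users' data (hypothetical data and threshold, for illustration).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)

# Placeholder observations: e.g., facial expression features plus
# spatial features describing robot and pedestrian navigation behavior.
X = rng.normal(size=(500, 16))          # observation feature vectors
ratings = rng.integers(1, 6, size=500)  # 5-point performance ratings
users = rng.integers(0, 20, size=500)   # user IDs, for unseen-user splits

# Binarize: ratings above the scale midpoint count as "good" performance.
y = (ratings > 3).astype(int)

# Hold out whole users so the test set contains only unseen users.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=users))

clf = RandomForestClassifier(random_state=0)
clf.fit(X[train_idx], y[train_idx])
print("F1 on unseen users:", f1_score(y[test_idx], clf.predict(X[test_idx])))
```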