In human-robot collaboration, the objectives of the human are often unknown to the robot. Moreover, even when the objective is known, the human's behavior remains uncertain. In order to plan a robust robot behavior, a key preliminary question is then: how can realistic human behaviors be derived for a given objective? A major issue is that such a human behavior must itself account for the robot's behavior; otherwise, collaboration cannot occur. In this paper, we rely on Markov decision models, representing the uncertainty over the human objective as a probability distribution over a finite set of objective functions (inducing a distribution over human behaviors). Based on this, we propose two contributions: 1) an approach to automatically generate an uncertain human behavior (a policy) for each given objective function while accounting for possible robot behaviors; and 2) a robot planning algorithm that is robust to the above-mentioned uncertainties and relies on solving a partially observable Markov decision process (POMDP) obtained by reasoning on a distribution over human behaviors. Experiments in a co-working scenario provide qualitative and quantitative results that evaluate our approach.
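The core idea of representing uncertainty over the human objective as a distribution over a finite set of objective functions, each inducing a human policy, can be illustrated with a minimal sketch. The example below is not the paper's algorithm: it assumes a toy 1-D grid, a hypothetical negative-distance objective for each candidate goal, a noisy-rational (Boltzmann) human policy per objective, and a Bayesian belief update of the kind a POMDP-based robot planner would perform when observing human actions.

```python
import math

# Hypothetical 1-D grid world with cells 0..4: the human steps left or right.
# Each candidate objective is a reward function; here, closeness to a goal cell.
GOALS = [0, 4]                 # two possible human goals (illustrative assumption)
PRIOR = {0: 0.5, 4: 0.5}       # uniform prior over which objective the human has

def q_value(state, action, goal):
    """Stand-in objective function: negative distance to the goal after moving."""
    nxt = max(0, min(4, state + action))
    return -abs(nxt - goal)

def boltzmann_policy(state, goal, beta=3.0):
    """Noisy-rational human: softmax over action values (one policy per objective)."""
    acts = [-1, +1]
    weights = [math.exp(beta * q_value(state, a, goal)) for a in acts]
    z = sum(weights)
    return {a: w / z for a, w in zip(acts, weights)}

def belief_update(belief, state, action):
    """Bayes rule: an observed human action shifts the belief over objectives."""
    post = {g: b * boltzmann_policy(state, g)[action] for g, b in belief.items()}
    z = sum(post.values())
    return {g: p / z for g, p in post.items()}

belief = dict(PRIOR)
# The human repeatedly steps right from cell 2: evidence for goal 4.
for s in [2, 3]:
    belief = belief_update(belief, s, +1)
print(belief)  # belief mass concentrates on goal 4
```

In the full POMDP formulation, the hidden objective is part of the state and this belief update happens implicitly through the observation model; the sketch only shows why observing human actions reduces the robot's uncertainty over the human's objective.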