Decision Making for Human-in-the-loop Robotic Agents via Uncertainty-Aware Reinforcement Learning

In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed. However, knowing when to request such assistance is critical: too few requests can lead to the robot making mistakes, but too many requests can overload the expert. In this paper, we present a Reinforcement Learning based approach to this problem, where a semi-autonomous agent asks for external assistance when it has low confidence in the eventual success of the task. The confidence level is computed by estimating the variance of the return from the current state. We show that this estimate can be iteratively improved during training using a Bellman-like recursion. On discrete navigation problems with both fully- and partially-observable state information, we show that our method makes effective use of a limited budget of expert calls at run-time, despite having no access to the expert at training time.
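The variance estimate and the help-request rule described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the exact recursion, so the code assumes the standard one that follows from the law of total variance, Var(s) = E[(r + γV(s') − V(s))² + γ²Var(s')], applied to a fixed policy. The four-state Markov reward process, the names `TRANSITIONS`, `evaluate`, and `should_ask_expert`, and the threshold and budget values are all hypothetical and chosen only for illustration.

```python
# Hypothetical 4-state Markov reward process induced by a fixed policy.
# TRANSITIONS[s] is a list of (probability, reward, next_state) outcomes.
TRANSITIONS = {
    0: [(1.0, 0.0, 1)],                 # corridor: always moves to state 1
    1: [(0.5, 1.0, 3), (0.5, 0.0, 3)],  # risky step: succeeds half the time
    2: [(1.0, 1.0, 3)],                 # safe step: always succeeds
    3: [],                              # terminal state
}
GAMMA = 0.9  # discount factor (assumed value)


def evaluate(transitions, gamma, sweeps=100):
    """Iterate Bellman-style recursions for the mean and variance of the return."""
    value = {s: 0.0 for s in transitions}
    variance = {s: 0.0 for s in transitions}
    for _ in range(sweeps):
        # Mean of the return: V(s) = E[r + gamma * V(s')].
        for s, outcomes in transitions.items():
            if outcomes:
                value[s] = sum(p * (r + gamma * value[s2]) for p, r, s2 in outcomes)
        # Variance of the return: Var(s) = E[delta^2 + gamma^2 * Var(s')],
        # where delta = r + gamma * V(s') - V(s) is the one-step error.
        for s, outcomes in transitions.items():
            if outcomes:
                variance[s] = sum(
                    p * ((r + gamma * value[s2] - value[s]) ** 2
                         + gamma ** 2 * variance[s2])
                    for p, r, s2 in outcomes
                )
    return value, variance


def should_ask_expert(state, variance, threshold, budget_left):
    """Request help only if calls remain and the return variance is high."""
    return budget_left > 0 and variance[state] > threshold


if __name__ == "__main__":
    value, variance = evaluate(TRANSITIONS, GAMMA)
    budget = 1  # hypothetical limit on expert calls
    for state in (2, 0, 1):
        ask = should_ask_expert(state, variance, threshold=0.1, budget_left=budget)
        print(state, round(value[state], 4), round(variance[state], 4), ask)
        if ask:
            budget -= 1
```

In this toy problem the risky state 1 has return variance 0.25, the corridor state 0 inherits a discounted variance of about 0.2025, and the safe state 2 has zero variance, so the agent never spends its budget there. A learned agent would estimate both quantities from sampled transitions rather than from a known transition table; the exact sweep is used here only to keep the example deterministic.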
