Robotic systems are increasingly present in our society every day. In human-robot environments, it is crucial that end-users correctly understand their robotic team partners in order to collaboratively complete a task. To improve action understanding, users demand more explainability about the decisions made by the robot in particular situations. Recently, explainable robotic systems have emerged as an alternative focused not only on completing a task satisfactorily, but also on justifying, in a human-like manner, the reasons that lead to making a decision. In reinforcement learning scenarios, considerable effort has been devoted to providing explanations using data-driven approaches, particularly from the visual input modality in deep learning-based systems. In this work, we focus instead on the decision-making process of reinforcement learning agents performing a task in a robotic scenario. Experimental results are obtained using three different set-ups, namely, a deterministic navigation task, a stochastic navigation task, and a continuous visual-based object-sorting task. As a way to explain the goal-driven robot's actions, we use the probability of success computed by three different proposed approaches: memory-based, learning-based, and introspection-based. These approaches differ in the amount of memory required to compute or estimate the probability of success, as well as in the kind of reinforcement learning representation in which they can be used. In this regard, we use the memory-based approach as a baseline, since it is obtained directly from the agent's observations. When comparing the learning-based and the introspection-based approaches to this baseline, both are found to be suitable alternatives for computing the probability of success, obtaining high levels of similarity under both Pearson's correlation and the mean squared error.
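As an illustration of the comparison described above, the following is a minimal sketch, not the authors' implementation, of how a memory-based probability-of-success estimate could be tabulated from logged episode outcomes and then compared against an alternative estimate (e.g., learning-based or introspection-based) using Pearson's correlation and the mean squared error. All function names, the episode-log format, and the toy data are assumptions made for illustration only.

```python
# Minimal sketch (assumption: not the paper's implementation) of comparing two
# probability-of-success estimates via Pearson's correlation and the MSE.
import numpy as np
from scipy.stats import pearsonr


def memory_based_success_probability(episodes, n_states):
    """Estimate P(success | state) by counting, over logged episodes,
    how often visiting a state ended in a successful episode."""
    visits = np.zeros(n_states)
    successes = np.zeros(n_states)
    for states_visited, was_successful in episodes:
        for s in set(states_visited):
            visits[s] += 1
            if was_successful:
                successes[s] += 1
    # Avoid division by zero for states that were never visited.
    return np.divide(successes, visits, out=np.zeros(n_states), where=visits > 0)


def compare_estimates(p_baseline, p_alternative):
    """Return (Pearson's r, mean squared error) between two estimates."""
    r, _ = pearsonr(p_baseline, p_alternative)
    mse = float(np.mean((p_baseline - p_alternative) ** 2))
    return r, mse


if __name__ == "__main__":
    # Toy episode log: (list of visited state indices, episode succeeded?)
    episodes = [([0, 1, 2], True), ([0, 3], False), ([1, 2], True), ([0, 1], False)]
    p_memory = memory_based_success_probability(episodes, n_states=4)
    # Stand-in for a learning- or introspection-based estimate.
    p_other = np.clip(p_memory + np.random.normal(0.0, 0.05, size=4), 0.0, 1.0)
    print("Pearson's r and MSE:", compare_estimates(p_memory, p_other))
```

In the paper's setting, the baseline estimate would come from the agent's observations, while the alternative estimates would be produced by the learning-based or introspection-based approaches; high Pearson's correlation and low MSE indicate that the alternative is a suitable substitute for the memory-based baseline.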