This article surveys reinforcement learning approaches in social robotics. Reinforcement learning is a framework for decision-making problems in which an agent interacts with its environment through trial and error to discover an optimal behavior. Since interaction is a key component of both reinforcement learning and social robotics, reinforcement learning can be a well-suited approach for real-world interactions with physically embodied social robots. The scope of the paper is limited to studies that involve physical social robots and real-world human-robot interactions with users. We present a thorough analysis of reinforcement learning approaches in social robotics. In addition to the survey, we categorize existing reinforcement learning approaches based on the method used and the design of the reward mechanism. Moreover, since communication capability is a prominent feature of social robots, we discuss and group the papers based on the communication medium used for reward formulation. Considering the importance of designing the reward function, we also categorize the papers based on the nature of the reward. This categorization includes three major themes: interactive reinforcement learning, intrinsically motivated methods, and task performance-driven methods. The paper also covers the benefits and challenges of reinforcement learning in social robotics, the evaluation methods of the papers (whether they use subjective and algorithmic measures), a discussion in view of real-world reinforcement learning challenges and proposed solutions, and the points that remain to be explored, including approaches that have thus far received less attention. Thus, this paper aims to become a starting point for researchers interested in using and applying reinforcement learning methods in this particular research field.