To provide adaptive and user-friendly solutions for robotic manipulation, it is important that an agent can learn to accomplish tasks even when it is given only very sparse instruction signals. To address the difficulties reinforcement learning algorithms face when task rewards are sparse, this paper proposes an intrinsic motivation approach that can be easily integrated into any standard reinforcement learning algorithm and allows robotic manipulators to learn useful manipulation skills from only sparse extrinsic rewards. By integrating and balancing empowerment and curiosity, the approach outperforms other state-of-the-art intrinsic exploration approaches in extensive empirical evaluations. Qualitative analysis further shows that, when combined with diversity-driven intrinsic motivations, the approach helps manipulators learn a set of diverse skills that could be transferred to other, more complicated manipulation tasks and accelerate learning on them.
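To make the general idea concrete, the following is a minimal sketch of how an intrinsic reward combining curiosity and empowerment bonuses might be folded into a sparse extrinsic reward before a standard RL update. The weighted-sum form, the weights `beta_cur` and `beta_emp`, and all function names here are illustrative assumptions; the abstract does not specify the exact balancing scheme used in the paper.

```python
def combined_reward(r_extrinsic: float,
                    r_curiosity: float,
                    r_empowerment: float,
                    beta_cur: float = 0.5,
                    beta_emp: float = 0.5) -> float:
    """Hypothetical weighted combination of a sparse extrinsic reward with
    curiosity- and empowerment-based intrinsic bonuses. The paper's actual
    balancing mechanism (e.g. adaptive or learned weights) may differ."""
    return r_extrinsic + beta_cur * r_curiosity + beta_emp * r_empowerment


# Illustrative use inside a generic RL training loop (agent, env, and the
# intrinsic-bonus estimators are placeholders, not the paper's implementation):
#
# r_total = combined_reward(env_reward, prediction_error, empowerment_estimate)
# agent.update(state, action, r_total, next_state)
```

Because the intrinsic bonuses enter only through the scalar reward, a scheme of this shape can in principle be attached to any standard RL algorithm, which is the integration property the abstract emphasizes.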