Reinforcement learning requires skillful problem definition and substantial computational effort to solve optimization and control problems, which can limit its practical prospects. Introducing human guidance into reinforcement learning is a promising way to improve learning performance. In this paper, a comprehensive human guidance-based reinforcement learning framework is established. A novel prioritized experience replay mechanism that adapts to human guidance during the reinforcement learning process is proposed to boost the efficiency and performance of the algorithm. To relieve the heavy workload on human participants, a behavior model is established based on an incremental online learning method to mimic human actions. We design two challenging autonomous driving tasks to evaluate the proposed algorithm. Experiments are conducted to assess its training and testing performance and its learning mechanism. Comparative results against state-of-the-art methods suggest the advantages of our algorithm in terms of learning efficiency, performance, and robustness.
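The abstract does not specify how the replay mechanism adapts to human guidance, so the following is only a minimal sketch of one plausible reading: proportional prioritized replay in which transitions recorded under human guidance receive an extra priority term. The class name `GuidedPrioritizedReplay`, the `human_bonus` parameter, and the additive form of the bonus are all assumptions made for illustration and are not taken from the paper. Importance-sampling weights, which a full prioritized replay implementation would normally include, are omitted for brevity.

```python
import random


class GuidedPrioritizedReplay:
    """Toy proportional prioritized replay with a bonus for human-guided steps.

    Illustrative only: the additive human_bonus is an assumption, not the
    mechanism proposed in the paper.
    """

    def __init__(self, capacity, alpha=0.6, human_bonus=1.0, eps=1e-3, seed=None):
        self.capacity = capacity
        self.alpha = alpha              # how strongly priorities skew sampling
        self.human_bonus = human_bonus  # hypothetical extra priority for guided steps
        self.eps = eps                  # keeps every priority strictly positive
        self.items = []                 # (transition, is_human) pairs
        self.priorities = []
        self.next_idx = 0               # ring-buffer write position
        self.rng = random.Random(seed)

    def _priority(self, td_error, is_human):
        bonus = self.human_bonus if is_human else 0.0
        return (abs(td_error) + bonus + self.eps) ** self.alpha

    def add(self, transition, td_error, is_human):
        """Store a transition, overwriting the oldest one when full."""
        p = self._priority(td_error, is_human)
        if len(self.items) < self.capacity:
            self.items.append((transition, is_human))
            self.priorities.append(p)
        else:
            self.items[self.next_idx] = (transition, is_human)
            self.priorities[self.next_idx] = p
        self.next_idx = (self.next_idx + 1) % self.capacity

    def sample(self, batch_size):
        """Draw indices with probability proportional to priority."""
        idxs = self.rng.choices(
            range(len(self.items)), weights=self.priorities, k=batch_size
        )
        return idxs, [self.items[i][0] for i in idxs]

    def update(self, idx, td_error):
        """Refresh a priority after the learner recomputes the TD error."""
        _, is_human = self.items[idx]
        self.priorities[idx] = self._priority(td_error, is_human)
```

With equal TD errors, a human-guided transition is sampled more often than an agent-generated one under this scheme. Whether the bonus should be additive, multiplicative, or decayed over training is a design choice this sketch leaves open.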