Deep reinforcement learning (DRL) has achieved groundbreaking successes in a wide variety of robotic applications. A natural consequence is the adoption of this paradigm for safety-critical tasks, where human safety and expensive hardware can be involved. In this context, it is crucial to optimize the performance of DRL-based agents while providing guarantees about their behavior. This paper presents a novel technique for incorporating domain-expert knowledge into a constrained DRL training loop. Our technique exploits the scenario-based programming paradigm, which is designed to allow specifying such knowledge in a simple and intuitive way. We validated our method on the popular robotic mapless-navigation problem, both in simulation and on a real robotic platform. Our experiments demonstrate that using our approach to leverage expert knowledge dramatically improves both the safety and the performance of the agent.
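To make the core idea concrete, the following is a minimal, hypothetical sketch of how an expert-specified scenario rule might constrain an agent's action choices during training. All names here (`Scenario`, `allowed`, `choose_action`) are illustrative assumptions, not the paper's actual API, and the toy rule stands in for the scenario-based programs the abstract describes.

```python
import random

# Hypothetical sketch: a scenario-based rule encodes domain-expert knowledge
# by vetoing actions it considers unsafe in a given state. The DRL training
# loop then samples only from the actions that every scenario permits.

class Scenario:
    """Expert rule: blocks actions declared unsafe in a given state."""

    def __init__(self, blocked):
        # blocked: maps a state label to the set of actions forbidden there
        self.blocked = blocked

    def allowed(self, state, actions):
        # Keep only actions this scenario does not forbid in `state`
        return [a for a in actions if a not in self.blocked.get(state, set())]


def choose_action(state, actions, scenarios, rng):
    """Pick an action among those every scenario permits (here: uniformly)."""
    for sc in scenarios:
        actions = sc.allowed(state, actions)
    if not actions:
        raise RuntimeError("all actions vetoed; expert rules are too strict")
    return rng.choice(actions)


# Toy mapless-navigation flavor: near an obstacle, the expert forbids
# moving forward, so the constrained agent can only turn.
avoid_collision = Scenario({"near_obstacle": {"forward"}})
rng = random.Random(0)
action = choose_action("near_obstacle",
                       ["forward", "left", "right"],
                       [avoid_collision], rng)
assert action in ("left", "right")
```

In a full constrained-DRL setup, the veto (or an equivalent penalty) would wrap the policy's action selection at every environment step, so the agent never executes, and never learns to prefer, actions the expert rules rule out.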