Recent advances in Reinforcement Learning (RL) combined with Deep Learning (DL) have demonstrated impressive performance on complex tasks, including autonomous driving. RL agents can deliver smooth, human-like driving behavior, but the limited interpretability of Deep Reinforcement Learning (DRL) creates a verification and certification bottleneck. Instead of relying on a single RL agent to learn a complex task, we propose HPRL - Hierarchical Program-triggered Reinforcement Learning, which uses a hierarchy consisting of a structured program together with multiple RL agents, each trained to perform a relatively simple task. The focus of verification shifts to the master program under simple guarantees from the RL agents, leading to a significantly more interpretable and verifiable implementation than a single complex RL agent. We evaluate the framework on a range of driving tasks and NHTSA pre-crash scenarios using CARLA, an open-source dynamic urban simulation environment.
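To make the architecture concrete, the following is a minimal sketch of a program-triggered hierarchy: a structured master program dispatches among simple, separately trained agents based on explicit guard conditions, so verification reduces to checking the dispatch logic plus per-agent guarantees. The agents, state fields, and thresholds here are hypothetical stand-ins, not the paper's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class State:
    # Hypothetical observation fields for illustration only.
    distance_to_turn: float   # metres to the next planned turn
    obstacle_ahead: bool      # whether an obstacle is detected in-lane

# Stand-ins for trained RL agents, each handling one simple sub-task.
def lane_follow(state: State) -> str:
    return "throttle"

def brake(state: State) -> str:
    return "brake"

def turn(state: State) -> str:
    return "steer"

def master(state: State, agents: Dict[str, Callable[[State], str]]) -> str:
    """The 'master program': a small, verifiable dispatch over simple agents.

    Each branch is an explicit, checkable guard; the agents themselves only
    need to satisfy simple local guarantees for their sub-task.
    """
    if state.obstacle_ahead:
        return agents["brake"](state)
    if state.distance_to_turn < 10.0:  # hypothetical trigger threshold
        return agents["turn"](state)
    return agents["lane_follow"](state)

agents = {"brake": brake, "turn": turn, "lane_follow": lane_follow}
print(master(State(distance_to_turn=50.0, obstacle_ahead=False), agents))
```

Because the hierarchy's control flow lives entirely in `master`, standard program-verification techniques apply to the dispatch logic, while each agent only needs to be certified for its narrow sub-task.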