Autonomous vehicles must handle diverse traffic conditions while making safe and efficient decisions and maneuvers. On the one hand, a single optimization- or sampling-based motion planner cannot efficiently generate safe trajectories in real time, particularly when many interacting vehicles are nearby. On the other hand, end-to-end learning methods cannot guarantee the safety of their outputs. To address this challenge, we propose H-CtRL, a hierarchical behavior planning framework that combines a set of low-level safe controllers with a high-level reinforcement learning algorithm that coordinates them. Safety is guaranteed by the low-level optimization- or sampling-based controllers, while the high-level reinforcement learning algorithm makes H-CtRL an adaptive and efficient behavior planner. To train and test the proposed algorithm, we built a simulator that reproduces traffic scenes from real-world datasets. H-CtRL is shown to be effective in various realistic simulation scenarios, achieving satisfactory performance in both safety and efficiency.
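The two-level structure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the state (gap to the lead vehicle, ego speed), the two toy controllers `keep_lane` and `speed_up`, the headway thresholds, and the use of tabular Q-learning as the high-level learner. The abstract does not specify the actual controllers, state representation, reward, or reinforcement learning algorithm.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical state: (gap to lead vehicle in m, ego speed in m/s).
State = Tuple[float, float]


@dataclass
class SafeController:
    """Toy stand-in for a low-level optimization/sampling-based controller."""
    name: str
    plan: Callable[[State], float]  # returns a commanded acceleration in m/s^2


def keep_lane(state: State) -> float:
    gap, speed = state
    # Brake when the time headway drops below 2 s (assumed threshold).
    return -2.0 if gap < 2.0 * speed else 0.0


def speed_up(state: State) -> float:
    gap, speed = state
    # Accelerate only when the time headway exceeds 3 s (assumed threshold).
    return 1.0 if gap > 3.0 * speed else 0.0


class Coordinator:
    """High-level learner that picks which low-level controller to run.

    Tabular Q-learning is used here only as a simple placeholder.
    """

    def __init__(self, controllers: List[SafeController], epsilon: float = 0.1,
                 alpha: float = 0.5, gamma: float = 0.9, seed: int = 0) -> None:
        self.controllers = controllers
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma
        self.rng = random.Random(seed)
        self.q: Dict[Tuple[int, int], List[float]] = {}

    def _values(self, state: State) -> List[float]:
        gap, speed = state
        key = (int(gap // 10), int(speed // 5))  # coarse discretization
        return self.q.setdefault(key, [0.0] * len(self.controllers))

    def select(self, state: State) -> int:
        # Epsilon-greedy choice over controller indices.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.controllers))
        values = self._values(state)
        return max(range(len(values)), key=values.__getitem__)

    def update(self, state: State, action: int, reward: float,
               next_state: State) -> None:
        # One-step Q-learning update.
        values = self._values(state)
        target = reward + self.gamma * max(self._values(next_state))
        values[action] += self.alpha * (target - values[action])
```

In this sketch the learner only ever chooses among controllers, and the acceleration command always comes from the selected controller's own `plan` function. This mirrors the division of labor described in the abstract, where the low-level controllers are responsible for safety and the high-level policy for adaptivity and efficiency.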