Vision-based autonomous urban driving in dense traffic is challenging due to the complexity of urban environments and the dynamics of driving behaviors. Widely applied methods either rely heavily on hand-crafted rules or learn from limited human experience, which makes it hard for them to generalize to rare but critical scenarios. In this paper, we present CADRE, a novel CAscade Deep REinforcement learning framework for model-free, vision-based autonomous urban driving. In CADRE, to derive representative latent features from raw observations, we first train a Co-attention Perception Module (CoPM) offline, which leverages a co-attention mechanism to learn the inter-relationships between visual and control information from a pre-collected driving dataset. Cascaded with the frozen CoPM, we then present an efficient distributed proximal policy optimization framework that learns the driving policy online under the guidance of specifically designed reward functions. We conduct a comprehensive empirical study on the CARLA NoCrash benchmark as well as specific obstacle-avoidance scenarios in autonomous urban driving tasks. The experimental results demonstrate the effectiveness of CADRE and its superiority over the state of the art by a wide margin.
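The following is a minimal sketch of how such a cascade could be wired up: a co-attention block that fuses visual and control features, and a frozen perception module feeding a small actor-critic head to be trained with PPO. The module names, feature dimensions, and the affinity-based co-attention formulation here are illustrative assumptions for exposition, not details taken from the paper.

```python
# Hypothetical sketch of a cascaded perception + policy architecture (PyTorch).
# Shapes, layer sizes, and the affinity-style co-attention are assumptions.
import torch
import torch.nn as nn


class CoAttention(nn.Module):
    """Fuse visual tokens and a control feature via a learned affinity matrix."""

    def __init__(self, visual_dim: int, control_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.affinity = nn.Linear(control_dim, visual_dim, bias=False)
        self.visual_proj = nn.Linear(visual_dim, hidden_dim)
        self.control_proj = nn.Linear(control_dim, hidden_dim)

    def forward(self, visual_tokens: torch.Tensor, control_feat: torch.Tensor):
        # visual_tokens: (B, N, visual_dim); control_feat: (B, control_dim)
        # Affinity scores between each visual token and the control feature.
        scores = torch.bmm(visual_tokens, self.affinity(control_feat).unsqueeze(-1))
        attn = torch.softmax(scores, dim=1)                   # (B, N, 1)
        attended = (attn * visual_tokens).sum(dim=1)          # (B, visual_dim)
        fused = torch.tanh(self.visual_proj(attended) + self.control_proj(control_feat))
        return fused                                          # (B, hidden_dim)


class CascadedPolicy(nn.Module):
    """Frozen perception module cascaded with an actor-critic head for PPO."""

    def __init__(self, perception: nn.Module, hidden_dim: int = 128, n_actions: int = 2):
        super().__init__()
        self.perception = perception
        for p in self.perception.parameters():   # perception stays frozen during RL
            p.requires_grad = False
        self.actor = nn.Sequential(nn.Linear(hidden_dim, 64), nn.ReLU(),
                                   nn.Linear(64, n_actions))
        self.critic = nn.Sequential(nn.Linear(hidden_dim, 64), nn.ReLU(),
                                    nn.Linear(64, 1))

    def forward(self, visual_tokens: torch.Tensor, control_feat: torch.Tensor):
        with torch.no_grad():
            z = self.perception(visual_tokens, control_feat)
        return self.actor(z), self.critic(z)


# Usage example with random inputs (batch of 4, 36 visual tokens of dim 256).
copm = CoAttention(visual_dim=256, control_dim=8)
policy = CascadedPolicy(copm)
logits, value = policy(torch.randn(4, 36, 256), torch.randn(4, 8))
```

Only the actor-critic head receives gradients during online policy optimization; the perception module is treated as a fixed feature extractor, mirroring the offline-then-online cascade described above.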