Deep reinforcement learning (DRL) provides a promising way to learn navigation in complex autonomous driving scenarios. However, identifying the subtle cues that can indicate drastically different outcomes remains an open problem in designing autonomous systems that operate in human environments. In this work, we show that explicitly inferring the latent states of other agents and encoding spatio-temporal relationships in a reinforcement learning framework can help address this difficulty. We encode prior knowledge of the latent states of other drivers through a framework that combines a reinforcement learner with a supervised learner. In addition, we model the influence passing between different vehicles through graph neural networks (GNNs). The proposed framework significantly improves performance when navigating T-intersections compared with state-of-the-art baseline approaches.
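To make the "influence passing between different vehicles" concrete, the following is a minimal sketch (an assumption for illustration, not the authors' implementation) of one round of GNN-style message passing: each vehicle's encoded state is updated by aggregating linearly transformed features from its neighbours. The weights `W_msg` and `W_self` stand in for learned parameters and are randomly initialised here.

```python
import numpy as np

rng = np.random.default_rng(0)

num_vehicles, feat_dim = 4, 8
# Per-vehicle state encodings (toy random features standing in for learned embeddings)
features = rng.normal(size=(num_vehicles, feat_dim))
# Fully connected interaction graph without self-loops
adjacency = np.ones((num_vehicles, num_vehicles)) - np.eye(num_vehicles)

# Hypothetical learned weight matrices (randomly initialised for this sketch)
W_msg = rng.normal(size=(feat_dim, feat_dim))
W_self = rng.normal(size=(feat_dim, feat_dim))

def message_passing_step(h, adj):
    """One round of influence passing: sum transformed neighbour features,
    combine with the vehicle's own transformed features, apply a nonlinearity."""
    messages = adj @ (h @ W_msg)            # aggregate messages from neighbours
    return np.tanh(h @ W_self + messages)   # updated per-vehicle representations

h_next = message_passing_step(features, adjacency)
print(h_next.shape)  # (4, 8)
```

Stacking several such rounds lets information propagate over multi-hop paths in the interaction graph, which is how a GNN can capture indirect influence between vehicles that are not immediate neighbours.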