Autonomous agents that drive on roads shared with human drivers must reason about the nuanced interactions among traffic participants. This poses a highly challenging decision-making problem, since human behavior is influenced by a multitude of factors (e.g., human intentions and emotions) that are hard to model. This paper presents a decision-making approach for autonomous driving, focusing on the complex task of merging into moving traffic, where uncertainty emanates from the behavior of other drivers and imperfect sensor measurements. We frame the problem as a partially observable Markov decision process (POMDP) and solve it online with Monte Carlo tree search. The solution to the POMDP is a policy that performs high-level driving maneuvers, such as giving way to an approaching car, keeping a safe distance from the vehicle in front, or merging into traffic. Our method leverages a model learned from data to predict the future states of traffic while explicitly accounting for interactions among the surrounding agents. From these predictions, the autonomous vehicle can anticipate the future consequences of its actions on the environment and optimize its trajectory accordingly. We thoroughly test our approach in simulation, showing that the autonomous vehicle can adapt its behavior to different situations. We also compare against other methods, demonstrating an improvement with respect to the considered performance metrics.
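To make the online-planning idea concrete, the following is a minimal, self-contained sketch of sampling-based maneuver selection for a merge under uncertainty. It is a toy illustration only: the state (a scalar gap to one approaching car), the two maneuvers, the noisy transition model, and all reward values are assumptions made for the example, not the paper's learned interaction model or its full Monte Carlo tree search. The belief over the other driver is represented by particles, and each candidate maneuver is evaluated by simulated rollouts, which is the core mechanism behind online POMDP planning.

```python
import random

# Hypothetical high-level maneuvers (a small subset of those in the paper).
ACTIONS = ["give_way", "merge"]


def step(gap, action, rng):
    """Toy transition/reward model (illustrative assumption).

    `gap` is the distance (m) to an approaching car in the target lane.
    A negative gap means the car has already passed the merge point.
    """
    if action == "merge":
        # Merging is safe with a large gap, or once the car has passed.
        safe = gap > 20.0 or gap <= 0.0
        return None, (10.0 if safe else -100.0)  # terminal after merging
    # Giving way: the approaching car closes the gap with some noise,
    # and waiting incurs a small time penalty.
    return gap - rng.uniform(4.0, 8.0), -1.0


def plan(belief_particles, n_sims=200, depth=8, seed=0):
    """Pick the root maneuver with the best Monte Carlo value estimate.

    `belief_particles` is a particle set over the unobserved gap,
    standing in for the belief state of the POMDP.
    """
    rng = random.Random(seed)
    values = {}
    for action in ACTIONS:
        total = 0.0
        for _ in range(n_sims):
            gap = rng.choice(belief_particles)  # sample a state from the belief
            g, r = step(gap, action, rng)
            total += r
            # Continue with a simple rollout policy: give way until it
            # becomes safe to merge, then merge.
            d = 1
            while g is not None and d < depth:
                a = "merge" if (g > 25.0 or g <= 0.0) else "give_way"
                g, r2 = step(g, a, rng)
                total += (0.95 ** d) * r2
                d += 1
        values[action] = total / n_sims
    return max(values, key=values.get), values
```

With a belief concentrated on a large gap, the planner merges immediately; with a small gap, it prefers to give way and let the approaching car pass first, mirroring the adaptive behavior described above.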