Planning under social interactions with other agents is an essential problem for autonomous driving. Because the autonomous vehicle's actions both affect and are affected by other agents, it must efficiently infer how those agents will react. Most existing approaches formulate the problem as a generalized Nash equilibrium problem and solve it with optimization-based methods. However, these methods demand substantial computational resources and are easily trapped in local minima because the problem is non-convex. Monte Carlo Tree Search (MCTS) successfully tackles such issues in game-theoretic problems. However, because the interaction game tree grows exponentially, general MCTS still requires a very large number of iterations to reach the optimum. In this paper, we introduce an efficient game-theoretic trajectory planning algorithm based on general MCTS that incorporates a prediction algorithm as a heuristic. On top of it, a socially compliant reward and a Bayesian inference algorithm are designed to generate diverse driving behaviors and identify the other driver's driving preference. Results on datasets containing naturalistic driving behavior in highly interactive scenarios demonstrate the effectiveness of the proposed framework.
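To make the idea of identifying another driver's preference by Bayesian inference concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes a discrete set of hypothetical preference types (`"aggressive"`, `"courteous"`), hand-picked illustrative rewards, and a Boltzmann (softmax) observation model with rationality parameter `beta`; none of these names or values come from the abstract.

```python
import math


def boltzmann_likelihood(rewards, chosen, beta=1.0):
    """P(chosen action | preference), assuming a softmax over action rewards."""
    m = max(rewards.values())  # subtract the max for numerical stability
    weights = {a: math.exp(beta * (r - m)) for a, r in rewards.items()}
    return weights[chosen] / sum(weights.values())


def update_belief(belief, rewards_by_pref, chosen, beta=1.0):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    unnorm = {
        pref: prior * boltzmann_likelihood(rewards_by_pref[pref], chosen, beta)
        for pref, prior in belief.items()
    }
    z = sum(unnorm.values())
    return {pref: v / z for pref, v in unnorm.items()}


# Hypothetical preference types and rewards, chosen only for illustration.
belief = {"aggressive": 0.5, "courteous": 0.5}
rewards_by_pref = {
    "aggressive": {"yield": 0.0, "go": 1.0},
    "courteous": {"yield": 1.0, "go": 0.0},
}

# Observe the other driver yielding three times and update the belief each time.
for _ in range(3):
    belief = update_belief(belief, rewards_by_pref, "yield")
```

After repeated yielding observations, the belief shifts toward the `"courteous"` hypothesis. In a planner, such a belief could weight the other agent's simulated responses during tree search, although how the paper couples the two is not stated in the abstract.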