Despite advances in the autonomous driving domain, autonomous vehicles (AVs) remain inefficient and limited when it comes to cooperating with each other or coordinating with vehicles operated by humans. A group of autonomous and human-driven vehicles (HVs) that work together to optimize an altruistic social utility, as opposed to an egoistic individual utility, can co-exist seamlessly and ensure safety and efficiency on the road. Achieving this goal without explicit coordination among agents is challenging, mainly because it is difficult to predict the behavior of humans with heterogeneous preferences in mixed-autonomy environments. Formally, we model an AV's maneuver planning in mixed-autonomy traffic as a partially-observable stochastic game and attempt to derive optimal policies that lead to socially-desirable outcomes using a multi-agent reinforcement learning framework. We introduce a quantitative representation of the AVs' social preferences and design a distributed reward structure that induces altruism in their decision-making process. Our altruistic AVs are able to form alliances, guide the traffic, and affect the behavior of the HVs to handle competitive driving scenarios. As a case study, we compare egoistic AVs to our altruistic autonomous agents in a highway merging setting and demonstrate emergent behaviors that lead to a noticeable improvement in the number of successful merges as well as in overall traffic flow and safety.