Self-driving vehicles have their own intelligence for driving on open roads. However, vehicle managers, such as governments or companies, still need a way to tell these vehicles which behaviors are encouraged or forbidden. Unlike human drivers, current self-driving vehicles cannot understand traffic laws, and therefore rely on programmers to manually write the corresponding principles into the driving systems. This approach is inefficient and makes it hard to adapt to temporary traffic laws, especially when the vehicles use data-driven decision-making algorithms. Moreover, current self-driving systems rarely take traffic law modification into consideration. This work aims to design a decision-making method that adapts to road traffic laws. The decision-making algorithm is based on reinforcement learning (RL), in which traffic rules are usually encoded implicitly in deep neural networks. The main idea is to give self-driving vehicles adaptability to traffic laws through a law-adaptive backup policy. In this work, traffic laws written in natural language are first translated into logical expressions using Linear Temporal Logic (LTL). Then, the system monitors in advance whether the self-driving vehicle may break the traffic laws, which is made possible by designing a long-term RL action space. Finally, a sample-based planning method re-plans the trajectory when the vehicle may break the traffic rules. The method is validated in a Beijing Winter Olympic Lane scenario and an overtaking case, both built in the CARLA simulator. The results show that, by adopting this method, self-driving vehicles can comply effectively with newly issued or updated traffic laws. This method helps self-driving vehicles to be governed by digital traffic laws, which is necessary for the wide adoption of autonomous driving.
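The three steps described above (a rule expressed in temporal logic, advance monitoring of the planned behavior, and sample-based re-planning) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no implementation details, so the `State` fields, the reserved-lane rule, the finite-horizon reading of the LTL "always" operator, and the functions `complies`, `sample_candidates`, and `backup_plan` are all hypothetical names chosen for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class State:
    lane: int     # lane index (hypothetical convention: 0 is the rightmost lane)
    speed: float  # metres per second


Trajectory = Sequence[State]
Predicate = Callable[[State], bool]


def always(pred: Predicate, traj: Trajectory) -> bool:
    """Finite-horizon reading of LTL 'G pred': pred holds at every step."""
    return all(pred(s) for s in traj)


# Assumed example rule: "never drive in the reserved lane", i.e. G(lane != 2).
RESERVED_LANE = 2


def complies(traj: Trajectory) -> bool:
    return always(lambda s: s.lane != RESERVED_LANE, traj)


def sample_candidates(start: State, horizon: int, n: int,
                      rng: random.Random) -> List[List[State]]:
    """Sample n crude candidates, each holding one randomly chosen lane."""
    candidates = []
    for _ in range(n):
        lane = rng.choice([0, 1, 2])
        candidates.append([State(lane, start.speed) for _ in range(horizon)])
    return candidates


def backup_plan(policy_traj: Trajectory, start: State,
                rng: random.Random, n: int = 20) -> Optional[List[State]]:
    """Keep the policy's long-horizon plan if it is lawful, else re-plan."""
    if complies(policy_traj):
        return list(policy_traj)
    lawful = [c for c in sample_candidates(start, len(policy_traj), n, rng)
              if complies(c)]
    if not lawful:
        return None  # no lawful candidate found among the samples
    # Prefer the candidate that stays closest to the current lane.
    return min(lawful, key=lambda c: abs(c[0].lane - start.lane))
```

In this sketch the monitor only intervenes when the learned policy's predicted trajectory would violate the rule, so updating the rule changes the backup behavior without retraining the policy. A real system would need richer predicates, full LTL semantics, and kinematically feasible candidate trajectories.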