Autonomous driving has the potential to revolutionize mobility and is hence an active area of research. In practice, the behavior of autonomous vehicles must be acceptable, i.e., efficient, safe, and interpretable. While vanilla reinforcement learning (RL) finds performant behavioral strategies, these are often unsafe and uninterpretable. Safe RL approaches introduce safety, but the learned behavior mostly remains uninterpretable because it is jointly optimized for safety and performance without modeling the two separately. Interpretable machine learning, in turn, is rarely applied to RL. This paper proposes SafeDQN, which makes the behavior of autonomous vehicles safe and interpretable while remaining efficient. SafeDQN offers an understandable, semantic trade-off between the expected risk and the utility of actions while being algorithmically transparent. We show that SafeDQN finds interpretable and safe driving policies for a variety of scenarios and demonstrate how state-of-the-art saliency techniques can help to assess both risk and utility.
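The risk-utility trade-off described above can be sketched minimally as follows, assuming the agent maintains two separate per-action estimators, one for expected utility (return) and one for expected risk, and selects actions by a weighted combination. The function name `select_action` and the trade-off weight `lambda_risk` are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def select_action(q_utility, q_risk, lambda_risk=1.0):
    """Pick the action maximizing expected utility minus weighted expected risk.

    q_utility: per-action expected utility estimates (hypothetical estimator).
    q_risk:    per-action expected risk estimates (hypothetical estimator).
    lambda_risk: semantic trade-off weight; higher values mean more caution.
    """
    scores = np.asarray(q_utility) - lambda_risk * np.asarray(q_risk)
    return int(np.argmax(scores))

# Illustrative usage: three candidate driving actions
# (e.g., keep lane, accelerate, overtake).
q_u = [1.0, 2.0, 2.5]   # expected utility per action
q_r = [0.1, 0.5, 2.0]   # expected risk per action
print(select_action(q_u, q_r, lambda_risk=1.0))  # → 1 (balanced choice)
print(select_action(q_u, q_r, lambda_risk=0.0))  # → 2 (risk ignored)
```

Because the trade-off is a single, explicit term rather than being folded into one reward signal, the resulting decision rule stays algorithmically transparent: one can inspect exactly how much risk offset a given utility gain.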