Uncovering potential failure cases is a crucial step in the validation of safety-critical systems such as autonomous vehicles. Failure search may be done by logging substantial vehicle miles in either simulation or real-world testing. Because failure events are sparse, naive random search approaches require a significant number of vehicle operation hours to find potential system weaknesses. As a result, adaptive search techniques have been proposed to efficiently explore and uncover failure trajectories of an autonomous policy in simulation. Adaptive Stress Testing (AST) is one such method: it poses the problem of failure search as a Markov decision process and uses reinforcement learning techniques to find high-probability failures. However, this formulation requires a probability model for the actions of all agents in the environment. In systems where the environment actions are discrete and dependencies among agents exist, it may be infeasible to fully characterize the distribution or find a suitable proxy. This work proposes a data-driven approach that learns a classifier to model how humans identify critical states and uses this classifier to guide failure search in AST. We show that incorporating critical states into the AST framework generates failure scenarios with increased safety violations for an autonomous driving policy with a discrete action space.