Simulation-based testing of autonomous vehicles (AVs) has become an essential complement to road testing for ensuring safety. Consequently, substantial research has focused on searching for failure scenarios in simulation. However, a fundamental question remains: are AV failure scenarios identified in simulation meaningful in reality, i.e., are they reproducible on the real system? Because of the sim-to-real gap arising from discrepancies between simulated and real sensor data, a failure scenario identified in simulation may be either a spurious artifact of the synthetic sensor data or an actual failure that persists with real sensor data. One approach to validating simulated failure scenarios is to identify instances of the scenario in a corpus of real data and check whether the failure persists on the real data. To this end, we propose a formal definition of what it means for a labelled data item to match an abstract scenario, encoded as a scenario program in the SCENIC probabilistic programming language. Using this definition, we develop a querying algorithm that, given a scenario program and a labelled dataset, finds the subset of the data matching the scenario. Experiments demonstrate that our algorithm is accurate and efficient on a variety of realistic traffic scenarios, and scales to a reasonable number of agents.
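To make the querying problem concrete, the following is a minimal illustrative sketch, not the algorithm proposed here and not SCENIC syntax. It assumes a toy setting in which a scenario is a tuple of object roles plus pairwise predicates, and a labelled data item is a list of annotated objects; the names `LabelledObject`, `ToyScenario`, `matches`, `query`, and `within` are all hypothetical. A data item is taken to match if some assignment of distinct labelled objects to the scenario's roles satisfies every constraint.

```python
from dataclasses import dataclass
from itertools import permutations
from math import hypot
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class LabelledObject:
    """One annotated object in a data item (hypothetical label schema)."""
    kind: str  # e.g. "car" or "pedestrian"
    x: float   # position in metres, in a shared 2D frame
    y: float


Predicate = Callable[[LabelledObject, LabelledObject], bool]


@dataclass(frozen=True)
class ToyScenario:
    """Hypothetical stand-in for a scenario program: roles plus pairwise constraints."""
    roles: Tuple[str, ...]
    # Each constraint is (role index i, role index j, predicate on the two objects).
    constraints: Tuple[Tuple[int, int, Predicate], ...]


def matches(scenario: ToyScenario, item: List[LabelledObject]) -> bool:
    """Return True if some assignment of distinct objects to roles satisfies the scenario."""
    for candidate in permutations(item, len(scenario.roles)):
        # Every role must be filled by an object of the required kind.
        if any(obj.kind != role for obj, role in zip(candidate, scenario.roles)):
            continue
        if all(pred(candidate[i], candidate[j]) for i, j, pred in scenario.constraints):
            return True
    return False


def query(scenario: ToyScenario, dataset: Dict[str, List[LabelledObject]]) -> List[str]:
    """Return the ids of all data items that match the scenario."""
    return [item_id for item_id, item in dataset.items() if matches(scenario, item)]


def within(distance: float) -> Predicate:
    """Build a predicate that holds when two objects are at most `distance` metres apart."""
    return lambda a, b: hypot(a.x - b.x, a.y - b.y) <= distance


# Toy scenario: a car with a pedestrian no more than 10 m away.
scenario = ToyScenario(roles=("car", "pedestrian"), constraints=((0, 1, within(10.0)),))

dataset = {
    "frame_a": [LabelledObject("car", 0.0, 0.0), LabelledObject("pedestrian", 3.0, 4.0)],
    "frame_b": [LabelledObject("car", 0.0, 0.0), LabelledObject("pedestrian", 30.0, 40.0)],
    "frame_c": [LabelledObject("car", 0.0, 0.0), LabelledObject("car", 5.0, 0.0)],
}

print(query(scenario, dataset))  # ['frame_a']
```

This brute-force search enumerates every assignment of objects to roles, so its cost grows factorially with the number of agents; it is meant only to illustrate the input and output of a scenario query, not to suggest how such a query should be solved at scale.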