Behavior prediction remains one of the most challenging tasks in the autonomous vehicle (AV) software stack. Forecasting the future trajectories of nearby agents is critical to road safety because it gives an AV the information it needs to plan safe routes. However, prediction models are data-driven and trained on real-world data that may not represent the full range of scenarios an AV can encounter. It is therefore important to test these models extensively, before deployment, in a variety of scenarios involving interactive behaviors. To support this need, we present a simulation-based testing platform that supports (1) intuitive scenario modeling with a probabilistic programming language called Scenic, (2) specification of a multi-objective evaluation metric with a partial priority ordering, (3) falsification of the provided metric, and (4) parallelization of simulations for scalable testing. As part of the platform, we provide a library of 25 Scenic programs that model challenging test scenarios involving interactive traffic participant behaviors. We demonstrate the effectiveness and scalability of our platform by testing a trained behavior prediction model and searching for failure scenarios.