Safe deployment of self-driving cars (SDCs) necessitates thorough simulated and in-field testing. Most testing techniques consider virtualized SDCs within a simulation environment, whereas less effort has been directed towards assessing whether such techniques transfer to, and are effective with, a physical real-world vehicle. In this paper, we shed light on the problem of generalizing testing results obtained in a driving simulator to a physical platform, and we provide a characterization and quantification of the sim2real gap affecting SDC testing. In our empirical study, we compare SDC testing when deployed on a physical small-scale vehicle versus its digital twin. Because driving quality indicators are unavailable on the physical platform, we use neural rendering to estimate them through visual odometry, allowing full comparability with the digital twin. Then, we investigate the transferability of behavior and failure exposure between virtual and real-world environments, targeting both unintended abnormal test data and intended adversarial examples. Our study shows that, despite the use of a faithful digital twin, there are still critical shortcomings that contribute to the reality gap between the virtual and physical worlds, threatening existing testing solutions that only consider virtual SDCs. On the positive side, our results identify the test configurations for which physical testing can be avoided, either because their outcome transfers between virtual and physical environments, or because the uncertainty profiles in the simulator can help predict their outcome in the real world.