Scalable sensor simulation is an important yet challenging open problem for safety-critical domains such as self-driving. Current work in image simulation either fails to be photorealistic or does not model the 3D environment and the dynamic objects within it, losing high-level controllability and physical realism. In this paper, we present GeoSim, a geometry-aware image composition process that synthesizes novel urban driving scenes by augmenting existing images with dynamic objects extracted from other scenes and rendered at novel poses. Towards this goal, we first build a diverse bank of 3D objects with both realistic geometry and appearance from sensor data. During simulation, we perform a novel geometry-aware simulation-by-composition procedure that 1) proposes plausible and realistic object placements into a given scene, 2) renders novel views of dynamic objects from the asset bank, and 3) composes and blends the rendered image segments. The resulting synthetic images are photorealistic, traffic-aware, and geometrically consistent, allowing image simulation to scale to complex use cases. We demonstrate two such important applications: long-range realistic video simulation across multiple camera sensors, and synthetic data generation for data augmentation on downstream segmentation tasks.
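The three-step simulation-by-composition procedure described above can be sketched at a high level. This is a minimal illustration only: the function names, the lane-pose placement heuristic, and the constant-color stand-in for novel-view rendering are all hypothetical simplifications, not GeoSim's actual implementation (which reconstructs textured 3D meshes from sensor data and uses learned blending).

```python
import numpy as np

def propose_placement(lane_poses, occupied):
    """Step 1 (placeholder heuristic): pick the first free pose along a
    lane centerline, so the inserted object is traffic-aware."""
    for pose in lane_poses:
        if pose not in occupied:
            return pose
    return None

def render_novel_view(asset, pose):
    """Step 2 (stand-in for mesh rendering): return an RGB patch and an
    alpha mask for the asset as seen from the target camera pose."""
    patch = np.full((4, 4, 3), asset["color"], dtype=np.float32)
    alpha = np.ones((4, 4, 1), dtype=np.float32)
    return patch, alpha

def compose(background, patch, alpha, top_left):
    """Step 3: alpha-blend the rendered segment into the target image."""
    y, x = top_left
    h, w = patch.shape[:2]
    region = background[y:y + h, x:x + w]
    background[y:y + h, x:x + w] = alpha * patch + (1.0 - alpha) * region
    return background

# Example: place one asset into an empty 8x8 image at a free lane pose.
background = np.zeros((8, 8, 3), dtype=np.float32)
pose = propose_placement([(2, 2), (0, 0)], occupied={(0, 0)})
patch, alpha = render_novel_view({"color": 0.5}, pose)
result = compose(background, patch, alpha, top_left=pose)
```

In the real system, each stage is far richer: placement is sampled from map priors and checked against existing traffic, novel views come from rendering reconstructed geometry with the target camera model, and blending uses a learned network to remove boundary artifacts.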