Semantic occupancy perception is essential for autonomous driving, as automated vehicles require fine-grained perception of 3D urban structures. However, existing benchmarks lack diversity in urban scenes and evaluate only front-view predictions. Towards comprehensive benchmarking of surrounding perception algorithms, we propose OpenOccupancy, the first surrounding semantic occupancy perception benchmark. In OpenOccupancy, we extend the large-scale nuScenes dataset with dense semantic occupancy annotations. Previous annotations rely on superimposing LiDAR points, so some occupancy labels are missed because the LiDAR has sparse channels. To mitigate this problem, we introduce the Augmenting And Purifying (AAP) pipeline, which densifies the annotations by ~2x and involves ~4000 human hours of labeling. In addition, we establish camera-based, LiDAR-based, and multi-modal baselines for the benchmark. Furthermore, since the complexity of surrounding occupancy perception lies in the computational burden of high-resolution 3D predictions, we propose the Cascade Occupancy Network (CONet) to refine the coarse prediction, which yields a relative performance improvement of ~30% over the baseline. We hope the OpenOccupancy benchmark will boost the development of surrounding occupancy perception algorithms.
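To make the annotation problem concrete, the following is a minimal sketch of how semantic occupancy labels can be derived from superimposed, semantically labeled LiDAR points, and why sparse points leave many voxels unlabeled. It is not the AAP pipeline and not the benchmark's code. The function name `voxelize_labels`, the majority-vote rule, the `empty` marker value of 255, and the toy grid parameters are all assumptions chosen for illustration.

```python
import numpy as np


def voxelize_labels(points, labels, grid_min, voxel_size, grid_shape,
                    num_classes, empty=255):
    """Assign each voxel the majority semantic label of the points inside it.

    Hypothetical helper for illustration only. Voxels that receive no
    point keep the `empty` marker, which is where sparse LiDAR leaves gaps.
    """
    points = np.asarray(points, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)
    grid_shape = tuple(grid_shape)

    # Map metric coordinates to integer voxel indices.
    idx = np.floor((points - np.asarray(grid_min)) / voxel_size).astype(np.int64)

    # Discard points that fall outside the grid.
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    idx, labels = idx[inside], labels[inside]

    # Count label votes per voxel (dense table; fine for a small demo grid).
    flat = np.ravel_multi_index(tuple(idx.T), grid_shape)
    n_vox = int(np.prod(grid_shape))
    votes = np.zeros((n_vox, num_classes), dtype=np.int64)
    np.add.at(votes, (flat, labels), 1)

    # Majority vote where at least one point landed, `empty` elsewhere.
    grid = np.full(n_vox, empty, dtype=np.uint8)
    hit = votes.sum(axis=1) > 0
    grid[hit] = votes[hit].argmax(axis=1)
    return grid.reshape(grid_shape)


# Toy example: five labeled points, one of them outside the grid.
pts = [[0.1, 0.1, 0.1], [0.2, 0.3, 0.1], [0.4, 0.4, 0.4],
       [1.6, 0.2, 0.2], [5.0, 0.0, 0.0]]
lbl = [1, 1, 2, 3, 1]
grid = voxelize_labels(pts, lbl, grid_min=(0.0, 0.0, 0.0), voxel_size=0.5,
                       grid_shape=(4, 4, 4), num_classes=4)
print("labeled voxels:", int((grid != 255).sum()), "of", grid.size)
```

In this toy grid only a small fraction of voxels receive a label, which mirrors the sparsity issue the abstract attributes to annotations built from LiDAR point superimposition alone.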