Recent advances in neural radiance fields (NeRFs) achieve state-of-the-art novel view synthesis and facilitate dense estimation of scene properties. However, NeRFs often fail for large, unbounded scenes that are captured from very sparse views, with the scene content concentrated far from the camera, as is typical for field robotics applications. In particular, NeRF-style algorithms perform poorly (1) when there are few views with little pose diversity, (2) when scenes contain saturation and shadows, and (3) when finely sampling large unbounded scenes with fine structures becomes computationally intensive. This paper proposes CLONeR, which significantly improves upon NeRF by allowing it to model large outdoor driving scenes that are observed from sparse input sensor views. This is achieved by decoupling occupancy and color learning within the NeRF framework into separate Multi-Layer Perceptrons (MLPs) trained using LiDAR and camera data, respectively. In addition, this paper proposes a novel method that builds differentiable 3D Occupancy Grid Maps (OGMs) alongside the NeRF model and leverages this occupancy grid to improve the sampling of points along a ray for volumetric rendering in metric space. Through extensive quantitative and qualitative experiments on scenes from the KITTI dataset, this paper demonstrates that the proposed method outperforms state-of-the-art NeRF models on both novel view synthesis and dense depth prediction tasks when trained on sparse input data.
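To make the idea of occupancy-guided ray sampling concrete, the following is a minimal sketch in Python with NumPy. It is not the CLONeR implementation: the abstract describes a differentiable OGM, whereas this sketch assumes a fixed, precomputed grid of occupancy probabilities. The function name `sample_along_ray` and all of its parameters (`grid_min`, `voxel_size`, `n_coarse`, `n_samples`, `threshold`) are hypothetical names chosen for illustration.

```python
import numpy as np


def sample_along_ray(origin, direction, occupancy, grid_min, voxel_size,
                     near=0.0, far=10.0, n_coarse=256, n_samples=16,
                     threshold=0.5):
    """Return n_samples metric depths along a ray, placed in occupied voxels.

    Illustrative assumption: `occupancy` is a fixed 3D array of occupancy
    probabilities, not the differentiable OGM described in the abstract.
    """
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)  # unit length, so t is metric depth
    grid_min = np.asarray(grid_min, dtype=float)

    # Candidate depths: centres of n_coarse uniform bins between near and far.
    edges = np.linspace(near, far, n_coarse + 1)
    t = 0.5 * (edges[:-1] + edges[1:])
    points = origin + t[:, None] * direction  # shape (n_coarse, 3)

    # Voxel index of each candidate; candidates outside the grid count as free.
    idx = np.floor((points - grid_min) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(occupancy.shape)), axis=1)
    occ = np.zeros(n_coarse)
    occ[inside] = occupancy[idx[inside, 0], idx[inside, 1], idx[inside, 2]]

    hits = t[occ > threshold]
    if hits.size == 0:
        # No occupied voxel on this ray: fall back to uniform sampling.
        return np.linspace(near, far, n_samples)

    # Spread the sample budget evenly over the occupied candidates.
    pick = np.linspace(0, hits.size - 1, n_samples).round().astype(int)
    return hits[pick]


# Usage: a 10 m cube with 1 m voxels and an obstacle 6 to 8 m along the x-axis.
grid = np.zeros((10, 10, 10))
grid[6:8, 5, 5] = 1.0
depths = sample_along_ray(origin=(0.0, 5.5, 5.5), direction=(1.0, 0.0, 0.0),
                          occupancy=grid, grid_min=(0.0, 0.0, 0.0),
                          voxel_size=1.0, near=0.0, far=10.0,
                          n_coarse=100, n_samples=16)
```

In this toy setup every returned depth falls inside the occupied interval, so the rendering budget is spent where the grid indicates surfaces rather than on empty space. A learned or differentiable grid would replace the hard threshold used here.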