Neural scene representations, such as neural radiance fields (NeRF), are based on training a multilayer perceptron (MLP) on a set of color images with known poses. An increasing number of devices now produce RGB-D information, which has proven valuable for a wide range of tasks. The aim of this paper is therefore to investigate how these promising implicit representations can be improved by incorporating depth information alongside the color images. In particular, the recently proposed Mip-NeRF approach, which uses conical frustums instead of rays for volume rendering, makes it possible to account for the varying area a pixel covers with distance from the camera center. The proposed method additionally models depth uncertainty. This addresses major limitations of NeRF-based approaches: it improves the accuracy of the reconstructed geometry, reduces artifacts, and shortens both training and prediction time. Experiments are performed on well-known benchmark scenes, and comparisons show improved accuracy in scene geometry and photometric reconstruction, while reducing training time by a factor of 3 to 5.
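As a rough illustration of how per-pixel depth uncertainty can be folded into a training objective, the sketch below uses a Gaussian negative log-likelihood over observed depths. This is a minimal assumption-laden example, not the paper's exact formulation: the function name, the noise model, and the toy data are illustrative only.

```python
import numpy as np

def depth_nll_loss(pred_depth, obs_depth, sigma):
    """Gaussian negative log-likelihood of the observed depth map.

    Pixels with large sensor uncertainty (sigma) are downweighted,
    so noisy depth measurements do not dominate training.
    (Illustrative sketch; not the paper's exact loss.)
    """
    sq_err = (pred_depth - obs_depth) ** 2
    return np.mean(sq_err / (2.0 * sigma ** 2) + np.log(sigma))

# Toy example: a 2x2 depth map with confident sensor readings.
obs = np.array([[1.0, 2.0], [3.0, 4.0]])
tight = np.full_like(obs, 0.1)   # low sensor uncertainty
loose = np.full_like(obs, 1.0)   # high sensor uncertainty

good = depth_nll_loss(obs, obs, tight)        # perfect prediction
bad = depth_nll_loss(obs + 0.5, obs, tight)   # 0.5 m error, confident pixels
tolerated = depth_nll_loss(obs + 0.5, obs, loose)  # same error, uncertain pixels
```

The same 0.5 m error is penalized far less under high uncertainty (`tolerated`) than under low uncertainty (`bad`), which captures the intuition behind modeling depth noise rather than treating the sensor as exact.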