This paper presents NeRF-RPN, the first significant object detection framework that operates directly on NeRF. Given a pre-trained NeRF model, NeRF-RPN aims to detect the bounding boxes of all objects in a scene. By exploiting a novel voxel representation that incorporates multi-scale 3D neural volumetric features, we demonstrate that it is possible to regress the 3D bounding boxes of objects in NeRF directly, without rendering the NeRF from any viewpoint. NeRF-RPN is a general framework and can be applied to detect objects without class labels. We experiment with NeRF-RPN using various backbone architectures, RPN head designs, and loss functions. All of them can be trained end-to-end to estimate high-quality 3D bounding boxes. To facilitate future research in object detection for NeRF, we build a new benchmark dataset consisting of both synthetic and real-world data with careful labeling and cleanup. Code and dataset are available at https://github.com/lyclyc52/NeRF_RPN.
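To make the idea of a voxel representation with multi-scale volumetric features concrete, the following is a minimal sketch and not the NeRF-RPN implementation. It assumes a hypothetical `toy_density` function standing in for a density query to a trained NeRF, samples it on a regular grid, and builds a coarse-to-fine pyramid with plain average pooling. The framework itself uses learned backbones, so the pooling here is only a stand-in, and all function names are illustrative.

```python
import numpy as np


def toy_density(points):
    """Hypothetical stand-in for a NeRF density query: a solid sphere of radius 0.5."""
    return (np.linalg.norm(points, axis=-1) < 0.5).astype(np.float32)


def sample_voxel_grid(field_fn, resolution, lo=-1.0, hi=1.0):
    """Evaluate field_fn on a regular resolution^3 grid covering the cube [lo, hi]^3."""
    axis = np.linspace(lo, hi, resolution)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    points = np.stack([xs, ys, zs], axis=-1)  # shape (R, R, R, 3)
    return field_fn(points)                   # shape (R, R, R)


def avg_pool3d(volume, factor=2):
    """Downsample a cubic volume by averaging non-overlapping factor^3 blocks."""
    r = volume.shape[0] // factor
    v = volume[: r * factor, : r * factor, : r * factor]
    return v.reshape(r, factor, r, factor, r, factor).mean(axis=(1, 3, 5))


def feature_pyramid(volume, levels=3):
    """Return a list of volumes ordered from finest to coarsest."""
    pyramid = [volume]
    for _ in range(levels - 1):
        pyramid.append(avg_pool3d(pyramid[-1]))
    return pyramid


grid = sample_voxel_grid(toy_density, resolution=32)
pyramid = feature_pyramid(grid, levels=3)
shapes = [level.shape for level in pyramid]
```

In this sketch a detector would consume every level of `pyramid`, so that large objects can be matched at coarse resolution and small ones at fine resolution, with no image ever being rendered from the field.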