The recent progress in implicit 3D representations, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner. This new representation effectively conveys the information of hundreds of high-resolution images in one compact format and allows photorealistic synthesis of novel views. In this work, using a variant of NeRF called Plenoxels, we create the first large-scale implicit-representation dataset for perception tasks, called PeRFception, which consists of two parts incorporating both object-centric and scene-centric scans for classification and segmentation. It achieves a significant memory compression rate (96.4\%) relative to the original dataset while containing both 2D and 3D information in a unified form. We construct classification and segmentation models that take this implicit format directly as input, and we also propose a novel augmentation technique to avoid overfitting on the backgrounds of images. The code and data are publicly available at https://postech-cvlab.github.io/PeRFception.
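To make the "implicit format as input" idea concrete: Plenoxels represents a scene as a voxel grid storing one density value and spherical-harmonic (SH) color coefficients per voxel, and a perception model can consume those per-voxel features directly as channels. Below is a minimal sketch under that assumption; the class `VoxelClassifier`, the dense-grid layout, and the toy 3D CNN are illustrative assumptions, not the paper's actual pipeline (which may use sparse convolutions over the sparse Plenoxels grid).

```python
# Hypothetical sketch: a toy classifier over a Plenoxels-style voxel grid.
# The 28 input channels correspond to 1 density value plus 27 SH coefficients
# (degree-2 SH: 9 coefficients x 3 color channels), as in Plenoxels.
import torch
import torch.nn as nn

class VoxelClassifier(nn.Module):
    """Toy 3D CNN over a dense (N, C, D, H, W) per-voxel feature grid."""
    def __init__(self, in_channels: int = 28, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),  # global pooling over the voxel grid
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, grid: torch.Tensor) -> torch.Tensor:
        return self.net(grid)

# Example: a random 32^3 grid with 28 feature channels per voxel.
grid = torch.randn(1, 28, 32, 32, 32)
logits = VoxelClassifier()(grid)
print(logits.shape)  # torch.Size([1, 10])
```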