Coordinate-based implicit neural networks, or neural fields, have emerged as useful representations of shape and appearance in 3D computer vision. Despite advances, however, it remains challenging to build neural fields for categories of objects without datasets like ShapeNet that provide canonicalized object instances, i.e., instances consistently aligned in 3D position and orientation (pose). We present Canonical Field Network (CaFi-Net), a self-supervised method to canonicalize the 3D pose of instances from an object category represented as neural fields, specifically neural radiance fields (NeRFs). CaFi-Net learns directly from continuous and noisy radiance fields using a Siamese network architecture designed to extract equivariant field features for category-level canonicalization. During inference, our method takes pre-trained neural radiance fields of novel object instances at arbitrary 3D pose and estimates a canonical field with consistent 3D pose across the entire category. Extensive experiments on a new dataset of 1300 NeRF models across 13 object categories show that our method matches or exceeds the performance of 3D point cloud-based methods.