We present iNeRF, a framework that performs mesh-free pose estimation by "inverting" a Neural Radiance Field (NeRF). NeRFs have been shown to be remarkably effective for view synthesis, that is, synthesizing photorealistic novel views of real-world scenes or objects. In this work, we investigate whether analysis-by-synthesis via NeRF can be applied to mesh-free, RGB-only 6DoF pose estimation: given an image, find the translation and rotation of a camera relative to a 3D object or scene. Our method assumes that no object mesh models are available at either training or test time. Starting from an initial pose estimate, we use gradient descent to minimize the residual between pixels rendered from a NeRF and pixels in an observed image. In our experiments on a synthetic dataset, we first study 1) how to sample rays during pose refinement so that iNeRF collects informative gradients and 2) how different batch sizes of rays affect iNeRF. We then show that for complex real-world scenes from the LLFF dataset, iNeRF can improve NeRF by estimating the camera poses of novel images and using these images as additional training data for NeRF. Finally, we show that iNeRF can perform category-level object pose estimation from RGB images, including for object instances not seen during training, by inverting a NeRF model inferred from a single view.
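The refinement loop described above (render a batch of pixels from the current pose estimate, compare them with the observed image, and take a gradient step on the pose) can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the analytic `radiance` function stands in for a trained NeRF, the camera is planar with a 3-parameter pose `(tx, ty, theta)` rather than a full 6DoF pose, each ray is evaluated at a single fixed depth, pixels are sampled uniformly at random, and central finite differences replace backpropagation through the renderer. All function names are hypothetical.

```python
import math
import random

NUM_PIXELS = 64   # pixels in the toy 1D image
DEPTH = 1.0       # single sample depth along each ray (toy simplification)
# Angle of each pixel's ray relative to the camera heading, in radians.
PIXEL_ANGLES = [-0.6 + 1.2 * k / (NUM_PIXELS - 1) for k in range(NUM_PIXELS)]


def radiance(x, y):
    """Hypothetical smooth scene standing in for a trained NeRF."""
    return math.sin(x) + math.cos(y) + 0.5 * math.sin(2.0 * x + y)


def render_pixel(pose, pixel):
    """Render one pixel for a planar camera pose (tx, ty, theta)."""
    tx, ty, theta = pose
    phi = theta + PIXEL_ANGLES[pixel]
    return radiance(tx + DEPTH * math.cos(phi), ty + DEPTH * math.sin(phi))


def photometric_loss(pose, observed, pixels):
    """Mean squared residual between rendered and observed pixels."""
    total = sum((render_pixel(pose, p) - observed[p]) ** 2 for p in pixels)
    return total / len(pixels)


def numerical_gradient(pose, observed, pixels, eps=1e-5):
    """Central finite differences; a real system would backpropagate instead."""
    grad = []
    for i in range(len(pose)):
        hi = list(pose)
        lo = list(pose)
        hi[i] += eps
        lo[i] -= eps
        diff = (photometric_loss(hi, observed, pixels)
                - photometric_loss(lo, observed, pixels))
        grad.append(diff / (2 * eps))
    return grad


def refine_pose(init_pose, observed, steps=500, batch_size=16, lr=0.02, seed=0):
    """Gradient descent on the pose, using a random batch of pixels per step."""
    rng = random.Random(seed)
    pose = list(init_pose)
    for _ in range(steps):
        pixels = rng.sample(range(NUM_PIXELS), batch_size)
        grad = numerical_gradient(pose, observed, pixels)
        pose = [p - lr * g for p, g in zip(pose, grad)]
    return tuple(pose)


# Synthetic "observed image" rendered from a ground-truth pose.
true_pose = (0.4, -0.3, 0.5)
observed = [render_pixel(true_pose, p) for p in range(NUM_PIXELS)]

# Refine a perturbed initial estimate and compare the full-image loss.
init_pose = (0.6, -0.1, 0.3)
refined = refine_pose(init_pose, observed)
all_pixels = range(NUM_PIXELS)
print("loss before:", photometric_loss(init_pose, observed, all_pixels))
print("loss after: ", photometric_loss(refined, observed, all_pixels))
```

The `batch_size` argument and the uniform `rng.sample` call mark the two design choices the experiments examine: how many rays to use per step and which rays to pick. A different sampling strategy would replace only the line that selects `pixels`.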