We propose a method for estimating the 6DoF pose of a rigid object with an available 3D model from a single RGB image. Unlike classical correspondence-based methods which predict 3D object coordinates at pixels of the input image, the proposed method predicts 3D object coordinates at 3D query points sampled in the camera frustum. The move from pixels to 3D points, which is inspired by recent PIFu-style methods for 3D reconstruction, enables reasoning about the whole object, including its (self-)occluded parts. For a 3D query point associated with a pixel-aligned image feature, we train a fully-connected neural network to predict: (i) the corresponding 3D object coordinates, and (ii) the signed distance to the object surface, with the first defined only for query points in the surface vicinity. We call the mapping realized by this network the Neural Correspondence Field. The object pose is then robustly estimated from the predicted 3D-3D correspondences by the Kabsch-RANSAC algorithm. The proposed method achieves state-of-the-art results on three BOP datasets and is shown to be superior especially in challenging cases with occlusion. The project website is at: linhuang17.github.io/NCF.
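The final pose-fitting stage named in the abstract, Kabsch-RANSAC on predicted 3D-3D correspondences, can be sketched as follows. This is a minimal illustration in NumPy, not the paper's implementation: the function names, the minimal sample size of 3, and the inlier threshold are assumptions chosen for clarity.

```python
import numpy as np

def kabsch(P, Q):
    # Least-squares rigid transform (R, t) aligning point set P to Q,
    # both of shape (N, 3), via SVD of the cross-covariance matrix.
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    # Sign correction to guarantee a proper rotation (det(R) = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t

def kabsch_ransac(P, Q, n_iters=200, inlier_thresh=0.01, seed=0):
    # RANSAC loop: fit on random minimal sets of 3 correspondences,
    # keep the hypothesis with the most inliers, refit on all inliers.
    # Here P holds predicted 3D object coordinates and Q the matching
    # 3D query points in the camera frame (illustrative convention).
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(n_iters):
        idx = rng.choice(len(P), size=3, replace=False)
        R, t = kabsch(P[idx], Q[idx])
        err = np.linalg.norm((P @ R.T + t) - Q, axis=1)
        inliers = err < inlier_thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return kabsch(P[best_inliers], Q[best_inliers])
```

The minimal sample size is 3 because three non-collinear point pairs fully determine a rigid transform; the refit on all inliers reduces the influence of the particular minimal set that was drawn.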