Object pose estimation from a single RGB image is a challenging problem due to variable lighting conditions and viewpoint changes. The most accurate pose estimation networks implement pose refinement via reprojection of a known, textured 3D model; however, such methods cannot be applied without high-quality 3D models of the observed objects. In this work we propose an approach to object pose estimation refinement, namely an Innovation CNN, that overcomes the requirement of reprojecting a textured 3D model. Our approach progressively improves an initial pose estimate by applying the Innovation CNN iteratively in a stochastic gradient descent (SGD) framework. We evaluate our method on the popular LINEMOD and Occlusion LINEMOD datasets and obtain state-of-the-art performance on both.
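The iterative SGD-style refinement described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `innovation_step` is a hypothetical stand-in for the trained Innovation CNN, and the pose representation, learning rate, and iteration count are assumptions for demonstration.

```python
import numpy as np

def refine_pose(pose, image, innovation_step, lr=0.2, n_iters=10):
    """Iteratively refine a pose estimate in an SGD-style loop.

    pose            : initial pose estimate as a flat parameter vector
                      (e.g. translation + rotation parameters).
    image           : the observed RGB image (passed through to the network).
    innovation_step : stand-in for the Innovation CNN; given the image and
                      the current pose, it returns an update direction
                      ("innovation") in pose-parameter space.
    """
    for _ in range(n_iters):
        # The network predicts a correction signal for the current estimate.
        innovation = innovation_step(image, pose)
        # SGD-style update: step the pose against the predicted innovation.
        pose = pose - lr * innovation
    return pose
```

The key point is that refinement needs only the image and the current pose estimate, with no reprojection of a textured 3D model inside the loop.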