Recently, RGBD-based category-level 6D object pose estimation has achieved promising performance; however, the requirement for depth information limits broader applications. To address this problem, this paper proposes a novel approach named Object Level Depth reconstruction Network (OLD-Net), which takes only RGB images as input for category-level 6D object pose estimation. We propose to directly predict object-level depth from a monocular RGB image by deforming the category-level shape prior into the object-level depth and the canonical NOCS representation. Two novel modules, Normalized Global Position Hints (NGPH) and the Shape-aware Decoupled Depth Reconstruction (SDDR) module, are introduced to learn high-fidelity object-level depth and delicate shape representations. Finally, the 6D object pose is solved by aligning the predicted canonical representation with the back-projected object-level depth. Extensive experiments on the challenging CAMERA25 and REAL275 datasets show that our model, though simple, achieves state-of-the-art performance.
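The final alignment step is the standard similarity-transform fit used in NOCS-style pipelines. Below is a minimal sketch of that step, assuming a pinhole camera model and the Umeyama algorithm for the canonical-to-camera alignment; the function names, the intrinsics matrix K, and the variable names are illustrative, not taken from the paper's code.

```python
import numpy as np

def back_project(depth, mask, K):
    """Back-project a masked object-level depth map into a 3D point
    cloud in the camera frame, using pinhole intrinsics K (3x3)."""
    v, u = np.nonzero(mask)                  # pixel coords on the object
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)       # (N, 3) camera-frame points

def umeyama_alignment(src, dst):
    """Least-squares similarity transform (scale s, rotation R,
    translation t) mapping src -> dst (Umeyama, 1991)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)         # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                         # resolve reflection ambiguity
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / (src_c ** 2).sum(axis=1).mean()
    t = mu_d - s * (R @ mu_s)
    return s, R, t

# Usage: align the predicted canonical (NOCS) coordinates with the
# back-projected object-level depth to recover the 6D pose.
# obj_points = back_project(pred_depth, obj_mask, K)
# s, R, t = umeyama_alignment(pred_nocs, obj_points)
```

Because the predicted depth and the canonical representation come from the same network, this closed-form alignment recovers scale, rotation, and translation without any depth sensor input.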