6D pose estimation of textureless objects is a valuable but challenging task for many robotic applications. In this work, we propose a framework that addresses this challenge using only RGB images acquired from multiple viewpoints. The core idea of our approach is to decouple 6D pose estimation into a sequential two-step process: first estimating the 3D translation and then the 3D rotation of each object. This decoupled formulation first resolves the scale and depth ambiguities inherent in single RGB images; the resulting translation estimates then greatly simplify the second stage, in which the object orientation is identified accurately. Moreover, to accommodate the multi-modal distributions present in rotation space, we develop an optimization scheme that explicitly handles object symmetries and counteracts measurement uncertainties. Compared to the state-of-the-art multi-view approach, the proposed method achieves substantial improvements on a challenging 6D pose estimation dataset for textureless objects.