Object pose estimation is a necessary prerequisite for autonomous robotic manipulation, but the presence of symmetry increases the complexity of the task. Most existing pose estimation methods output a single 6D pose and thus cannot reason about symmetries. Recently, modeling object orientation with neural networks as a non-parametric probability distribution on the SO(3) manifold has shown impressive results. However, acquiring large-scale datasets to train pose estimation models remains a bottleneck. To address this limitation, we introduce an automatic pose labeling scheme. Given RGB-D images without object pose annotations and 3D object models, we design a two-stage pipeline consisting of point cloud registration and render-and-compare validation to generate multiple symmetrical pseudo-ground-truth pose labels for each image. Using the generated pose labels, we train an ImplicitPDF model to estimate the likelihood of an orientation hypothesis given an RGB image. An efficient hierarchical sampling of the SO(3) manifold enables tractable generation of the complete set of symmetries at multiple resolutions. During inference, the most likely orientation of the target object is estimated using gradient ascent. We evaluate the proposed automatic pose labeling scheme and the ImplicitPDF model on a photorealistic dataset and the T-LESS dataset, demonstrating the advantages of the proposed method.
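The coarse-to-fine search over SO(3) described above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: `toy_likelihood` is a hypothetical stand-in for the learned ImplicitPDF network, symmetries are mimicked by a set of equivalent quaternion modes, and local resampling around the current best hypothesis replaces the gradient-ascent refinement step.

```python
import numpy as np

def random_rotations(n, rng):
    # Uniform random unit quaternions on S^3 (Shoemake's method),
    # i.e. uniform samples of SO(3).
    u1, u2, u3 = rng.random(n), rng.random(n), rng.random(n)
    return np.stack([
        np.sqrt(1 - u1) * np.sin(2 * np.pi * u2),
        np.sqrt(1 - u1) * np.cos(2 * np.pi * u2),
        np.sqrt(u1) * np.sin(2 * np.pi * u3),
        np.sqrt(u1) * np.cos(2 * np.pi * u3),
    ], axis=1)

def toy_likelihood(q, modes):
    # Hypothetical stand-in for the ImplicitPDF network: score each
    # hypothesis by its similarity to a set of symmetry-equivalent
    # "ground-truth" quaternions (q and -q encode the same rotation,
    # hence the abs on the quaternion dot product).
    sims = np.abs(q @ modes.T)          # (n, k)
    return sims.max(axis=1)

def coarse_to_fine_argmax(likelihood, modes, rng,
                          n=4096, rounds=3, sigma=0.2):
    # Hierarchical search: evaluate a coarse uniform grid, then
    # repeatedly resample in a shrinking neighbourhood of the best
    # hypothesis (a sampling-based analogue of gradient ascent).
    q = random_rotations(n, rng)
    best = q[np.argmax(likelihood(q, modes))]
    for _ in range(rounds):
        cand = best + rng.normal(scale=sigma, size=(n, 4))
        cand /= np.linalg.norm(cand, axis=1, keepdims=True)
        best = cand[np.argmax(likelihood(cand, modes))]
        sigma *= 0.3                    # tighten the neighbourhood
    return best
```

In the real system the scored hypotheses come from a precomputed hierarchical grid on SO(3) rather than random samples, and the final mode is refined by gradient ascent on the network's output; the structure of the search, however, is the same.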