Most existing methods for category-level pose estimation rely on object point clouds. For transparent objects, however, depth cameras usually fail to capture meaningful data, producing point clouds with severe artifacts. Without a high-quality point cloud, existing methods are not applicable to challenging transparent objects. To tackle this problem, we present StereoPose, a novel stereo image framework for category-level object pose estimation, ideally suited for transparent objects. For robust estimation from pure stereo images, we develop a pipeline that decouples category-level pose estimation into object size estimation, initial pose estimation, and pose refinement. StereoPose then estimates object pose based on a representation in the normalized object coordinate space (NOCS). To address the issue of image content aliasing, we further define a back-view NOCS map for the transparent object. The back-view NOCS map aims to reduce the network learning ambiguity caused by content aliasing and to leverage informative cues on the back of the transparent object for more accurate pose estimation. To further improve the performance of the stereo framework, StereoPose is equipped with a parallax attention module for stereo feature fusion and an epipolar loss that improves the stereo-view consistency of network predictions. Extensive experiments on the public TOD dataset demonstrate the superiority of the proposed StereoPose framework for category-level 6D transparent object pose estimation.
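The stereo-view consistency mentioned above rests on a standard property of rectified stereo: a 3D point projects to the same image row in both views, and its depth follows from the horizontal disparity. The sketch below illustrates only that geometric property. It is not the epipolar loss defined by StereoPose, whose exact form the abstract does not give. The class `RectifiedStereo`, the function names, and the intrinsics used are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RectifiedStereo:
    """Hypothetical rectified rig: both cameras share intrinsics and differ
    only by a translation of `baseline` along the left camera's x axis."""
    fx: float        # focal length along x, in pixels
    fy: float        # focal length along y, in pixels
    cx: float        # principal point, x
    cy: float        # principal point, y
    baseline: float  # distance between the camera centres, in metres

    def project(self, point):
        """Project a 3D point (left-camera frame) into both views.

        Returns ((u_left, v), (u_right, v)); the row v is shared because
        the rig is rectified.
        """
        x, y, z = point
        if z <= 0:
            raise ValueError("point must lie in front of the cameras")
        v = self.fy * y / z + self.cy
        u_left = self.fx * x / z + self.cx
        u_right = self.fx * (x - self.baseline) / z + self.cx
        return (u_left, v), (u_right, v)

    def depth_from_disparity(self, u_left, u_right):
        """Recover depth from horizontal disparity: z = fx * baseline / d."""
        disparity = u_left - u_right
        if disparity <= 0:
            raise ValueError("disparity must be positive for a valid match")
        return self.fx * self.baseline / disparity


def epipolar_row_error(left_uvs, right_uvs):
    """Mean absolute row difference between matched 2D predictions.

    An illustrative consistency penalty (an assumption, not the paper's
    loss): it is zero when every matched pair lies on the same image row.
    """
    if len(left_uvs) != len(right_uvs) or not left_uvs:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(abs(l[1] - r[1]) for l, r in zip(left_uvs, right_uvs))
    return total / len(left_uvs)


if __name__ == "__main__":
    rig = RectifiedStereo(fx=600.0, fy=600.0, cx=320.0, cy=240.0, baseline=0.1)
    left, right = rig.project((0.05, -0.02, 0.8))
    print("row error:", epipolar_row_error([left], [right]))
    print("depth:", rig.depth_from_disparity(left[0], right[0]))
```

In a learning setting, a penalty of this kind would be applied to per-view network predictions for the same object point, so that predictions violating the shared-row constraint are discouraged. How StereoPose applies its own loss to its predictions is not specified in the abstract.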