In this paper, we propose a novel 3D graph convolution-based pipeline for category-level 6D pose and size estimation from monocular RGB-D images. The proposed method leverages an efficient 3D data augmentation scheme and a novel vector-based decoupled rotation representation. Specifically, we first design an orientation-aware autoencoder with 3D graph convolution for latent feature learning. The learned latent feature is insensitive to point shift and size thanks to the shift- and scale-invariance properties of 3D graph convolution. Then, to efficiently decode the rotation information from the latent feature, we design a novel, flexible, vector-based decomposable rotation representation that employs two decoders to access the rotation information in a complementary manner. The proposed rotation representation has two major advantages: 1) its decoupled nature makes rotation estimation easier; 2) the flexible length and rotation angle of the vectors allow us to find a more suitable vector representation for a specific pose estimation task. Finally, we propose a 3D deformation mechanism to increase the generalization ability of the pipeline. Extensive experiments show that the proposed pipeline achieves state-of-the-art performance on category-level tasks. Furthermore, the experiments demonstrate that the proposed rotation representation is better suited to pose estimation tasks than other rotation representations.
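To make the idea of a vector-based rotation representation concrete, the following is a minimal sketch of how two predicted 3D vectors could be turned into a full rotation matrix. It assumes that each of the two decoders outputs one 3D direction vector and that the two outputs are combined by Gram-Schmidt orthogonalization; that combination step, the function name `rotation_from_two_vectors`, and the helper names are illustrative assumptions, not the procedure described in the paper. The inputs need not be unit length or exactly perpendicular, which loosely mirrors the flexible length and angle mentioned above.

```python
import math


def normalize(v):
    """Return v scaled to unit length; reject zero-length input."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return [x / n for x in v]


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3D vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_two_vectors(v1, v2):
    """Build a 3x3 rotation matrix from two non-parallel 3D vectors.

    Hypothetical helper (an assumption, not the paper's method):
    v1 fixes the first axis, v2 is orthogonalized against it to give
    the second axis, and the third axis is their cross product.
    """
    x = normalize(v1)
    # Remove the component of v2 that lies along x (Gram-Schmidt step).
    d = dot(x, v2)
    y = normalize([b - d * a for a, b in zip(x, v2)])
    # x and y are orthonormal, so z completes a right-handed frame.
    z = cross(x, y)
    # The three axes become the columns of the rotation matrix.
    return [[x[i], y[i], z[i]] for i in range(3)]


if __name__ == "__main__":
    # Two noisy, non-unit, non-perpendicular vectors standing in for decoder outputs.
    R = rotation_from_two_vectors([0.9, 0.1, -0.2], [0.05, 1.1, 0.3])
    for row in R:
        print(row)
```

In this sketch the first vector is trusted fully and the second only corrects the remaining degree of freedom; a symmetric treatment of both vectors is equally possible and may be closer to what a two-decoder design intends.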