In this paper, we focus on category-level 6D pose and size estimation from a monocular RGB-D image. Previous methods suffer from inefficient category-level pose feature extraction, which leads to low accuracy and slow inference. To tackle this problem, we propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation. First, we design an orientation-aware autoencoder with 3D graph convolution for latent feature extraction. The learned latent feature is insensitive to point shift and object size, thanks to the shift- and scale-invariance properties of the 3D graph convolution. Then, to efficiently decode category-level rotation information from the latent feature, we propose a novel decoupled rotation mechanism that employs two decoders to access the rotation information in a complementary manner. Meanwhile, we estimate translation and size through two residuals: the difference between the mean of the object points and the ground-truth translation, and the difference between the mean size of the category and the ground-truth size, respectively. Finally, to increase the generalization ability of FS-Net, we propose an online box-cage-based 3D deformation mechanism to augment the training data. Extensive experiments on two benchmark datasets show that the proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation. In particular, for category-level pose estimation, without extra synthetic data, our method outperforms existing methods by 6.3% on the NOCS-REAL dataset.
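The residual formulation for translation and size can be illustrated with a minimal sketch. This is not the FS-Net implementation: the function names, the array shapes, and the sign convention (residual = ground truth minus reference) are assumptions made for illustration, and the residual values would in practice come from a trained network rather than being computed from ground truth.

```python
import numpy as np


def recover_translation(points: np.ndarray, residual_t: np.ndarray) -> np.ndarray:
    """Add a translation residual to the centroid of the observed object points.

    points:     (N, 3) array of object points (hypothetical input layout).
    residual_t: (3,) residual, assumed here to be ground truth minus centroid.
    """
    return points.mean(axis=0) + residual_t


def recover_size(category_mean_size: np.ndarray, residual_s: np.ndarray) -> np.ndarray:
    """Add a size residual to the mean size of the object's category.

    category_mean_size: (3,) mean extents of the category (assumed known).
    residual_s:         (3,) residual, assumed to be ground truth minus mean size.
    """
    return category_mean_size + residual_s


# Toy example: four points whose centroid is (1, 1, 1).
points = np.array([[0.0, 0.0, 1.0],
                   [2.0, 0.0, 1.0],
                   [0.0, 2.0, 1.0],
                   [2.0, 2.0, 1.0]])
gt_translation = np.array([1.5, 0.5, 1.25])
# Training target under the assumed sign convention.
residual_t = gt_translation - points.mean(axis=0)

category_mean_size = np.array([0.5, 0.25, 0.5])
gt_size = np.array([0.75, 0.25, 0.25])
residual_s = gt_size - category_mean_size

translation = recover_translation(points, residual_t)
size = recover_size(category_mean_size, residual_s)
```

Predicting a residual relative to the point centroid or the category mean size keeps the regression target small and centered, which is one plausible reason for this design; the abstract itself does not state the motivation.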