Geometric feature extraction is a crucial component of point cloud registration pipelines. Recent work has demonstrated how supervised learning can be leveraged to learn better and more compact 3D features. However, those approaches' reliance on ground-truth annotation limits their scalability. We propose BYOC: a self-supervised approach that learns visual and geometric features from RGB-D video without relying on ground-truth pose or correspondence. Our key observation is that randomly initialized CNNs readily provide us with good correspondences, allowing us to bootstrap the learning of both visual and geometric features. Our approach combines classic ideas from point cloud registration with more recent representation learning approaches. We evaluate our approach on indoor scene datasets and find that our method outperforms traditional and learned descriptors, while being competitive with current state-of-the-art supervised approaches.
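To make the idea of feature-based correspondence concrete, the following is a minimal sketch of mutual nearest-neighbour matching between two sets of per-point feature vectors. The abstract does not say how BYOC extracts correspondences, so the matching rule, the cosine distance, and the names `cosine_distance`, `nearest`, and `mutual_matches` are illustrative assumptions rather than the authors' method. The toy vectors stand in for features that a CNN, randomly initialized or trained, might produce.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, candidates):
    """Index of the candidate closest to `query` in feature space."""
    return min(range(len(candidates)),
               key=lambda j: cosine_distance(query, candidates[j]))


def mutual_matches(feats_src, feats_tgt):
    """Keep only pairs (i, j) that are each other's nearest neighbour.

    Assumption: mutual consistency is used as the filter here; other
    filters (for example a ratio test) would be equally plausible.
    """
    matches = []
    for i, feat in enumerate(feats_src):
        j = nearest(feat, feats_tgt)
        if nearest(feats_tgt[j], feats_src) == i:
            matches.append((i, j))
    return matches


# Toy 2-D features standing in for per-point CNN outputs (hypothetical data).
feats_src = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
feats_tgt = [[0.1, 0.9], [0.9, 0.1], [-1.0, 0.2]]

matches = mutual_matches(feats_src, feats_tgt)
print(matches)  # -> [(1, 0), (2, 1)]
```

In this toy case source point 0 is dropped: its nearest target is point 1, but target 1 is closer to source point 2. In a registration pipeline, the surviving pairs would typically be passed to a rigid alignment step that estimates the relative pose between the two frames.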