In this paper, we introduce neural texture learning for 6D object pose estimation from synthetic data and a few unlabelled real images. Our major contribution is a novel learning scheme that removes the drawbacks of previous works, namely their strong dependency on co-modalities or additional refinement, which were previously necessary to provide training signals for convergence. We formulate this scheme as two sub-optimisation problems: texture learning and pose learning. We separately learn to predict realistic object textures from real image collections and learn pose estimation from pixel-perfect synthetic data. Combining these two capabilities then allows us to synthesise photorealistic novel views that supervise the pose estimator with accurate geometry. To alleviate the pose noise and segmentation imperfections present during the texture learning phase, we propose a surfel-based adversarial training loss together with texture regularisation from synthetic data. We demonstrate that the proposed approach significantly outperforms recent state-of-the-art methods without ground-truth pose annotations and generalises substantially better to unseen scenes. Remarkably, our scheme substantially improves the adopted pose estimators even when they are initialised with much inferior performance.