It is difficult to precisely annotate object instances and their semantics in 3D space, so synthetic data are used extensively for such tasks, e.g., category-level 6D object pose and size estimation. However, the ease of annotation in synthetic domains comes with the downside of a synthetic-to-real (Sim2Real) domain gap. In this work, we address this issue in the setting of Sim2Real unsupervised domain adaptation for category-level 6D object pose and size estimation. We propose a method built upon a novel Deep Prior Deformation Network, abbreviated as DPDN. DPDN learns to deform the features of categorical shape priors to match those of object observations, and is thus able to establish deep correspondence in the feature space for direct regression of object poses and sizes. To reduce the Sim2Real domain gap, we formulate a novel self-supervised objective upon DPDN via consistency learning. More specifically, we apply two rigid transformations to each object observation in parallel and feed the results into DPDN separately to yield dual sets of predictions. On top of this parallel learning, an inter-consistency term enforces cross consistency between the dual predictions to improve the sensitivity of DPDN to pose changes, while individual intra-consistency terms enforce self-adaptation within each learning branch. We train DPDN on the training sets of both the synthetic CAMERA25 and real-world REAL275 datasets; our results outperform existing methods on the REAL275 test set under both the unsupervised and supervised settings. Ablation studies also verify the efficacy of our designs. Our code is publicly released at https://github.com/JiehongLin/Self-DPDN.
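The inter-consistency idea described above can be illustrated with a minimal sketch. This is not the released DPDN code: the function names (`transform_pose`, `undo_transform`, `inter_consistency`), the squared-error form of the loss, and the omission of object size are all assumptions made for illustration. The sketch applies two known rigid transformations to one observation, maps each branch's predicted pose back to the original frame, and penalizes any disagreement between the two.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def transform_pose(pose, delta):
    """Pose of the observation after the rigid transform delta = (dR, dt).

    If the observation has pose (R, t), then moving every observed point
    p to dR @ p + dt gives the pose (dR @ R, dR @ t + dt).
    """
    (R, t), (dR, dt) = pose, delta
    new_t = [x + y for x, y in zip(matvec(dR, t), dt)]
    return matmul(dR, R), new_t

def undo_transform(pred, delta):
    """Map a pose predicted on the transformed input back to the original frame."""
    (R, t), (dR, dt) = pred, delta
    dR_inv = transpose(dR)  # the inverse of a rotation is its transpose
    return matmul(dR_inv, R), matvec(dR_inv, [x - y for x, y in zip(t, dt)])

def inter_consistency(pred1, delta1, pred2, delta2):
    """Hypothetical squared-error disagreement between the two branches."""
    R1, t1 = undo_transform(pred1, delta1)
    R2, t2 = undo_transform(pred2, delta2)
    rot_term = sum((R1[i][j] - R2[i][j]) ** 2
                   for i in range(3) for j in range(3))
    trans_term = sum((a - b) ** 2 for a, b in zip(t1, t2))
    return rot_term + trans_term

# Illustrative values only: one underlying pose and two random-looking transforms.
true_pose = (rot_z(0.3), [0.1, -0.2, 0.5])
delta1 = (rot_z(0.4), [0.02, 0.0, -0.01])
delta2 = (rot_z(-0.7), [-0.03, 0.01, 0.0])

# A predictor that tracks pose changes perfectly yields (near) zero loss.
ideal1 = transform_pose(true_pose, delta1)
ideal2 = transform_pose(true_pose, delta2)
ideal_loss = inter_consistency(ideal1, delta1, ideal2, delta2)

# A branch whose rotation is off by 0.1 rad is penalized.
biased1 = (rot_z(0.3 + 0.4 + 0.1), ideal1[1])
biased_loss = inter_consistency(biased1, delta1, ideal2, delta2)
```

Because the two transformations are known, the loss needs no pose labels, which is what makes a term of this kind usable on unlabeled real-world observations. In a real training loop the predictions would come from the network and the loss would be written in a differentiable framework.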