We consider an object pose estimation and model fitting problem in which, given a partial point cloud of an object, the goal is to estimate the object pose by fitting a CAD model to the sensor data. We solve this problem by combining (i) a semantic keypoint-based pose estimation model, (ii) a novel self-supervised training approach, and (iii) a certification procedure that not only verifies whether the output produced by the model is correct, but also flags whether the produced solution is unique. The semantic keypoint detector is initially trained in simulation and does not perform well on real data because of the domain gap. Our self-supervised training procedure uses a corrector module and a certification module to improve the detector. The corrector module adjusts the detected keypoints to compensate for the domain gap; it is implemented as a declarative layer, for which we develop a simple differentiation rule. The certification module declares whether the corrected output produced by the model is certifiable (i.e., correct) or not. At each iteration, the approach optimizes the loss induced only by the certifiable input-output pairs. As training progresses, the fraction of certifiable outputs increases, eventually reaching nearly $100\%$ in many cases. We also introduce the notion of strong certifiability, wherein the model can determine whether the predicted object model fit is unique. The detected semantic keypoints allow us to implement this check in the forward pass. We conduct extensive experiments on the ShapeNet and YCB datasets to evaluate the corrector, the certification, and the proposed self-supervised training, and show that the proposed approach achieves performance comparable to fully supervised baselines while requiring no pose or keypoint supervision on real data.
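The pipeline described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the pose is computed from keypoints or how certifiability is tested. The sketch assumes a least-squares (Kabsch) fit between model and detected keypoints, and a certificate that accepts a pose when every observed point lies within a threshold `eps` of the posed CAD model. The function names `kabsch_pose`, `is_certifiable`, and `certified_mean_loss`, and the threshold `eps`, are hypothetical. The corrector module is omitted.

```python
import numpy as np

def kabsch_pose(model_kp, detected_kp):
    """Least-squares rigid pose (R, t) such that detected_kp ~ model_kp @ R.T + t.

    Both inputs have shape (N, 3) and are in one-to-one correspondence.
    """
    mu_m = model_kp.mean(axis=0)
    mu_d = detected_kp.mean(axis=0)
    # Cross-covariance between the centered keypoint sets.
    H = (model_kp - mu_m).T @ (detected_kp - mu_d)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_m
    return R, t

def is_certifiable(cloud, model_points, R, t, eps):
    """Assumed certificate: every observed point is within eps of the posed model."""
    posed = model_points @ R.T + t
    # Pairwise distances, shape (num_cloud_points, num_model_points).
    dists = np.linalg.norm(cloud[:, None, :] - posed[None, :, :], axis=2)
    return bool(dists.min(axis=1).max() < eps)

def certified_mean_loss(losses, certified):
    """Average the per-sample losses over certifiable samples only."""
    losses = np.asarray(losses, dtype=float)
    mask = np.asarray(certified, dtype=bool)
    if not mask.any():
        return 0.0  # no certifiable sample in this batch, so no training signal
    return float(losses[mask].mean())
```

In a training loop, `is_certifiable` would be evaluated once per input, and the resulting flags passed to `certified_mean_loss`, so that non-certifiable outputs contribute nothing to the update. The one-sided distance check is a deliberate choice for partial point clouds: the sensor sees only part of the object, so the test asks that the cloud be explained by the model, not the reverse.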