Real-world robotics applications demand object pose estimation methods that work reliably across a variety of scenarios. Modern learning-based approaches require large labeled datasets and tend to perform poorly outside the training domain. Our first contribution is to develop a robust corrector module that corrects pose estimates using depth information, thus enabling existing methods to better generalize to new test domains; the corrector operates on semantic keypoints (but is also applicable to other pose estimators) and is fully differentiable. Our second contribution is an ensemble self-training approach that simultaneously trains multiple pose estimators in a self-supervised manner. Our ensemble self-training architecture uses the robust corrector to refine the output of each pose estimator; then, it evaluates the quality of the outputs using observable correctness certificates; finally, it uses the observably correct outputs for further training, without requiring external supervision. As an additional contribution, we propose small improvements to a regression-based keypoint detection architecture, to enhance its robustness to outliers; these improvements include a robust pooling scheme and a robust centroid computation. Experiments on the YCBV and TLESS datasets show the proposed ensemble self-training outperforms fully supervised baselines while not requiring 3D annotations on real data.
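The certify-then-select step of the self-training loop described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the corrector has already refined each pose estimate, and it stands in a simple geometric check for the observable correctness certificate: every observed depth point must lie within a tolerance `eps` of the posed object model. The names `certificate`, `select_certified`, and `eps` are hypothetical and do not come from the paper.

```python
import numpy as np


def certificate(model_pts, R, t, observed_pts, eps=0.02):
    """Hypothetical certificate (an assumption, not the paper's definition).

    Returns True when every observed depth point lies within `eps`
    of the object model placed at pose (R, t).
    """
    posed = model_pts @ R.T + t  # apply the rigid transform to the model points
    # Pairwise distances, shape (num_observed, num_model)
    dists = np.linalg.norm(observed_pts[:, None, :] - posed[None, :, :], axis=2)
    # Worst-case distance from an observed point to its nearest model point
    return bool(dists.min(axis=1).max() < eps)


def select_certified(model_pts, observed_pts, estimates, eps=0.02):
    """Keep only the ensemble outputs that pass the certificate.

    In a self-training loop, the surviving poses would act as
    pseudo-labels for the next round of training.
    """
    return [
        (R, t)
        for (R, t) in estimates
        if certificate(model_pts, R, t, observed_pts, eps)
    ]
```

Because the check only compares the posed model against the observed depth points, it needs no ground-truth pose, which is the property that allows certified outputs to be reused as training signal without external supervision.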