We consider a certifiable object pose estimation problem: given a partial point cloud of an object, the goal is not only to estimate the object pose but also to provide a certificate of correctness for the resulting estimate. Our first contribution is a general theory of certification for end-to-end perception models. In particular, we introduce the notion of $\zeta$-correctness, which bounds the distance between an estimate and the ground truth. We show that $\zeta$-correctness can be assessed by implementing two certificates: (i) a certificate of observable correctness, which asserts whether the model output is consistent with the input data and prior information, and (ii) a certificate of non-degeneracy, which asserts whether the input data is sufficient to compute a unique estimate. Our second contribution is to apply this theory to design a new learning-based certifiable pose estimator. We propose C-3PO, a semantic-keypoint-based pose estimation model augmented with the two certificates, to solve the certifiable pose estimation problem. C-3PO also includes a keypoint corrector, implemented as a differentiable optimization layer, that can correct large detection errors (e.g., those due to the sim-to-real gap). Our third contribution is a novel self-supervised training approach that uses our certificate of observable correctness to provide the supervisory signal to C-3PO during training. In each training iteration, the model trains only on the input-output pairs that are observably correct. As training progresses, the fraction of observably correct input-output pairs grows, eventually reaching nearly 100% in many cases. Our experiments show that (i) standard semantic-keypoint-based methods outperform more recent alternatives, (ii) C-3PO further improves performance and significantly outperforms all the baselines, and (iii) C-3PO's certificates are able to discern correct pose estimates.
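To make the certificate-gated training idea concrete, the following is a minimal sketch, not the method as implemented in C-3PO. It assumes that "observable correctness" can be approximated by checking that every input point lies within a tolerance of the model point cloud placed at the estimated pose. The function names `observably_correct` and `select_certified`, the tolerance `eps`, and the nearest-point criterion itself are all hypothetical choices made for illustration.

```python
import math

def observably_correct(input_cloud, posed_model, eps):
    """Hypothetical check: every input point lies within eps of the posed model.

    input_cloud and posed_model are non-empty sequences of 3D points;
    posed_model is the object model already transformed by the estimated pose.
    """
    # Distance from each observed point to its nearest point on the posed model.
    nearest = [min(math.dist(p, q) for q in posed_model) for p in input_cloud]
    # The estimate is consistent with the data if the worst-case distance is small.
    return max(nearest) < eps

def select_certified(pairs, eps):
    """Keep only the (input, output) pairs that pass the check (assumed filter)."""
    return [(x, y) for x, y in pairs if observably_correct(x, y, eps)]

# Usage: one consistent pair and one inconsistent pair.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
good_input = [(0.01, 0.01, 0.01), (1.01, 0.01, 0.01)]
bad_input = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)]
certified = select_certified([(good_input, model), (bad_input, model)], eps=0.1)
```

In a self-supervised loop of the kind described above, only the pairs returned by such a filter would contribute to the training loss in a given iteration.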