Real-time object pose estimation is necessary for many robot manipulation algorithms. However, state-of-the-art methods for object pose estimation are trained for a specific set of objects; these methods must therefore be retrained to estimate the pose of each new object, often requiring tens of GPU-days of training for optimal performance. In this paper, we propose the OSSID framework, which leverages a slow zero-shot pose estimator to self-supervise the training of a fast detection algorithm. This fast detector can then be used to filter the input to the pose estimator, drastically improving its inference speed. We show that this self-supervised training exceeds the performance of existing zero-shot detection methods on two widely used object pose estimation and detection datasets, without requiring any human annotations. Further, we show that the resulting pose estimation method has a significantly faster inference speed, because the detector filters out large parts of the image. Thus, our method for self-supervised online learning of a detector (trained using pseudo-labels from a slow pose estimator) leads to accurate pose estimation at real-time speeds, without requiring human annotations. Supplementary materials and code can be found at https://georgegu1997.github.io/OSSID/
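The loop described in the abstract (a slow pose estimator supplying pseudo-labels for a fast detector, which in turn restricts the regions the estimator has to process) can be sketched roughly as follows. This is a minimal illustration, not the OSSID implementation: `PoseHypothesis`, `PseudoLabelBuffer`, `process_frame`, the `min_score` threshold, and the callable signatures are all hypothetical names chosen for this sketch, and the detector training step that would consume the buffered labels is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class PoseHypothesis:
    """Hypothetical output of the slow zero-shot pose estimator."""
    box: Box      # image region in which the pose was estimated
    score: float  # estimator's own confidence, assumed to lie in [0, 1]


@dataclass
class PseudoLabelBuffer:
    """Stores confident estimator outputs as detector training targets."""
    min_score: float = 0.8  # assumed threshold, not taken from the paper
    labels: List[Box] = field(default_factory=list)

    def add(self, hypotheses: List[PoseHypothesis]) -> int:
        # Keep only hypotheses the estimator is confident about.
        kept = [h.box for h in hypotheses if h.score >= self.min_score]
        self.labels.extend(kept)
        return len(kept)


def process_frame(
    image_size: Tuple[int, int],
    detector: Callable[[Tuple[int, int]], List[Box]],
    pose_estimator: Callable[[Box], PoseHypothesis],
    buffer: PseudoLabelBuffer,
    detector_ready: bool,
) -> List[PoseHypothesis]:
    """Run one frame through the detector-filtered pose estimation loop."""
    width, height = image_size
    if detector_ready:
        # Fast path: the estimator only sees the regions the detector proposes.
        regions = detector(image_size)
    else:
        # Slow path: without a trained detector, search the whole image.
        regions = [(0, 0, width, height)]
    hypotheses = [pose_estimator(region) for region in regions]
    # Confident results become pseudo-labels for later detector updates.
    buffer.add(hypotheses)
    return hypotheses
```

In a real system the detector and estimator would operate on image tensors rather than on bare sizes and boxes; the stub signatures here only make the control flow explicit. The speed-up claimed in the abstract corresponds to the `detector_ready` branch, where the estimator no longer has to process the full image.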