Can a robot manipulate unseen objects of a known category in arbitrary poses, given only a single demonstration of a grasping pose on one object instance? In this paper, we address this challenge with USEEK, an unsupervised SE(3)-equivariant keypoints method whose keypoints are aligned across instances in a category, and use it to perform generalizable manipulation. USEEK follows a teacher-student structure that decouples unsupervised keypoint discovery from SE(3)-equivariant keypoint detection. With USEEK, the robot can infer category-level, task-relevant object frames in an efficient and explainable manner, enabling manipulation of any intra-category object from and to arbitrary poses. Through extensive experiments, we demonstrate that the keypoints produced by USEEK carry rich semantics and thus successfully transfer functional knowledge from the demonstration object to novel ones. Compared with other object representations for manipulation, USEEK is more adaptive to large intra-category shape variance, more robust with limited demonstrations, and more efficient at inference time.
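To illustrate how aligned keypoints can yield an object frame and carry a demonstrated grasp to a novel instance, the following is a minimal sketch, not the method's actual implementation. It assumes three non-collinear keypoints per object and builds the frame by Gram-Schmidt orthogonalization; the function names `frame_from_keypoints` and `transfer_grasp` and the frame construction itself are hypothetical choices made for this example.

```python
import numpy as np

def frame_from_keypoints(kps):
    """Build a 4x4 object frame from three non-collinear keypoints.

    Assumption for illustration: the origin is the first keypoint, the x-axis
    points to the second, and the third fixes the x-y plane.
    """
    p0, p1, p2 = np.asarray(kps, dtype=float)
    x = p1 - p0
    x /= np.linalg.norm(x)
    z = np.cross(x, p2 - p0)      # normal of the plane spanned by the keypoints
    z /= np.linalg.norm(z)
    y = np.cross(z, x)            # completes a right-handed orthonormal basis
    T = np.eye(4)
    T[:3, 0], T[:3, 1], T[:3, 2], T[:3, 3] = x, y, z, p0
    return T

def transfer_grasp(T_demo_obj, T_demo_grasp, T_new_obj):
    """Re-express a demonstrated grasp pose relative to a novel object's frame.

    All arguments are 4x4 homogeneous transforms in the world frame.
    """
    grasp_in_obj = np.linalg.inv(T_demo_obj) @ T_demo_grasp
    return T_new_obj @ grasp_in_obj
```

Because the frame is built only from differences and cross products of the keypoints, applying a rigid transform to the keypoints applies the same transform to the frame, which is the property that lets a single demonstrated grasp follow the object into arbitrary poses.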