We present Neural Descriptor Fields (NDFs), an object representation that encodes both points and relative poses between an object and a target (such as a robot gripper or a rack used for hanging) via category-level descriptors. We employ this representation for object manipulation, where given a task demonstration, we want to repeat the same task on a new object instance from the same category. We propose to achieve this objective by searching (via optimization) for the pose whose descriptor matches that observed in the demonstration. NDFs are conveniently trained in a self-supervised fashion via a 3D auto-encoding task that does not rely on expert-labeled keypoints. Further, NDFs are SE(3)-equivariant, guaranteeing performance that generalizes across all possible 3D object translations and rotations. We demonstrate learning of manipulation tasks from few (5-10) demonstrations both in simulation and on a real robot. Our performance generalizes across both object instances and 6-DoF object poses, and significantly outperforms a recent baseline that relies on 2D descriptors. Project website: https://yilundu.github.io/ndf/.
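To make the "search via optimization" step above concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of transferring a demonstrated gripper pose to a new object by matching descriptors. The names `descriptor_fn` and `find_pose`, the axis-angle parameterization, and the simple geometric stand-in descriptor are all illustrative assumptions; a real NDF would be a trained, SE(3)-equivariant occupancy network whose intermediate activations serve as the descriptors.

```python
# Sketch of NDF-style pose transfer: optimize an SE(3) transform of gripper query
# points so their descriptors on a NEW object match descriptors recorded in a demo.
# descriptor_fn below is a placeholder, not the paper's trained network.
import torch

def axis_angle_to_matrix(w: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: axis-angle vector (3,) -> rotation matrix (3, 3)."""
    theta = w.norm() + 1e-8
    k = w / theta
    K = torch.stack([
        torch.stack([torch.zeros(()), -k[2], k[1]]),
        torch.stack([k[2], torch.zeros(()), -k[0]]),
        torch.stack([-k[1], k[0], torch.zeros(())]),
    ])
    return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

def descriptor_fn(query_points: torch.Tensor, object_pcd: torch.Tensor) -> torch.Tensor:
    """Placeholder descriptor field: maps (N, 3) query points, conditioned on an
    object point cloud, to (N, D) descriptors. Here it is just object-relative
    coordinates plus distance, purely for illustration."""
    center = object_pcd.mean(dim=0)
    rel = query_points - center
    return torch.cat([rel, rel.norm(dim=-1, keepdim=True)], dim=-1)

def find_pose(query_points, new_object_pcd, demo_descriptors, steps=500, lr=1e-2):
    """Recover a gripper pose on a new object instance by descriptor matching."""
    w = (1e-3 * torch.randn(3)).requires_grad_()   # rotation (axis-angle)
    t = torch.zeros(3, requires_grad=True)          # translation
    opt = torch.optim.Adam([w, t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        R = axis_angle_to_matrix(w)
        transformed = query_points @ R.T + t
        desc = descriptor_fn(transformed, new_object_pcd)
        loss = (desc - demo_descriptors).abs().mean()  # L1 descriptor-matching energy
        loss.backward()
        opt.step()
    return axis_angle_to_matrix(w).detach(), t.detach()

# Usage: descriptors recorded at demo time on one object transfer the grasp pose
# to a new instance by minimizing the same descriptor-matching energy.
demo_pcd = torch.rand(500, 3)                       # point cloud of the demo object
query = torch.rand(20, 3) * 0.1                     # points rigidly attached to the gripper
demo_desc = descriptor_fn(query, demo_pcd)
R, t = find_pose(query, torch.rand(500, 3), demo_desc)
```

Because the descriptors are defined relative to the object, the same recorded values specify the corresponding pose on a new instance; the paper's SE(3)-equivariance guarantee is what makes this matching valid under arbitrary rotations and translations of the object.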