Expert demonstrations are a rich source of supervision for training visual robotic manipulation policies, but imitation learning methods often require either a large number of demonstrations or expensive online expert supervision to learn reactive closed-loop behaviors. In this work, we introduce SPARTN (Synthetic Perturbations for Augmenting Robot Trajectories via NeRF): a fully-offline data augmentation scheme for improving robot policies that use eye-in-hand cameras. Our approach leverages neural radiance fields (NeRFs) to synthetically inject corrective noise into visual demonstrations, using NeRFs to generate perturbed viewpoints while simultaneously calculating the corrective actions. This requires no additional expert supervision or environment interaction, and distills the geometric information in NeRFs into a real-time reactive RGB-only policy. In a simulated 6-DoF visual grasping benchmark, SPARTN improves success rates by 2.8$\times$ over imitation learning without the corrective augmentations and even outperforms some methods that use online supervision. It additionally closes the gap between RGB-only and RGB-D success rates, eliminating the previous need for depth sensors. In real-world 6-DoF robotic grasping experiments from limited human demonstrations, our method improves absolute success rates by $22.5\%$ on average, including objects that are traditionally challenging for depth-based methods. See video results at \url{https://bland.website/spartn}.
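To make the augmentation idea concrete, below is a minimal sketch (not the authors' implementation) of the core loop the abstract describes: for each demonstration step, sample a small SE(3) camera-pose perturbation, render the perturbed eye-in-hand view with a NeRF fit to that demonstration, and pair it with a corrective action that undoes the perturbation. The `nerf_render` function and the exact action composition are assumptions; the real action parameterization depends on the controller.

```python
import numpy as np

def random_se3_perturbation(rot_std=0.05, trans_std=0.01):
    """Sample a small random SE(3) perturbation as a 4x4 homogeneous matrix.
    Rotation noise is drawn as a small axis-angle vector and converted
    via Rodrigues' formula; standard deviations are illustrative."""
    w = np.random.normal(scale=rot_std, size=3)    # axis-angle rotation noise
    t = np.random.normal(scale=trans_std, size=3)  # translation noise (meters)
    theta = np.linalg.norm(w)
    K = np.array([[0.0, -w[2], w[1]],
                  [w[2], 0.0, -w[0]],
                  [-w[1], w[0], 0.0]])
    if theta < 1e-8:
        R = np.eye(3)
    else:
        K = K / theta
        R = np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def augment_step(nerf_render, cam_pose, action, num_aug=4):
    """Generate perturbed (image, corrective action) pairs for one demo step.

    nerf_render: hypothetical function mapping a 4x4 camera pose to an RGB
                 image, e.g. from a NeRF trained on this demonstration.
    cam_pose:    4x4 eye-in-hand camera pose at this step.
    action:      4x4 relative end-effector transform commanded by the expert.
    """
    augmented = []
    for _ in range(num_aug):
        dT = random_se3_perturbation()
        perturbed_pose = cam_pose @ dT           # noisy viewpoint in camera frame
        image = nerf_render(perturbed_pose)      # novel view synthesized offline
        corrective = np.linalg.inv(dT) @ action  # first undo the perturbation,
                                                 # then apply the expert action
        augmented.append((image, corrective))
    return augmented
```

Because both the rendering and the action correction are computed offline from the recorded trajectory, this augmentation requires no extra expert queries or environment interaction, which is the property the abstract emphasizes.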