Using multiple camera views simultaneously has been shown to improve the generalization and performance of visual policies. However, hardware costs and design constraints in real-world scenarios can make deploying multiple cameras impractical. In this study, we present a novel approach to improving the generalization performance of vision-based Reinforcement Learning (RL) algorithms for robotic manipulation tasks. Our method applies knowledge distillation: a pre-trained ``teacher'' policy, trained with multiple camera viewpoints, guides a ``student'' policy that learns from a single camera viewpoint. To make the student policy robust to camera-pose perturbations, it is trained with data augmentation and extreme viewpoint changes. As a result, the student policy learns robust visual features that allow it to locate the object of interest accurately and consistently, regardless of the camera viewpoint. We evaluate the efficacy and efficiency of the proposed method in both simulated and real-world environments. The results show that the single-view student policy successfully learns to grasp and lift a challenging object, which a policy trained from a single view alone could not. Furthermore, the student policy exhibits zero-shot transfer: it successfully grasps and lifts objects in real-world scenarios under unseen visual configurations.
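The teacher-student distillation scheme described above can be illustrated with a minimal sketch. Everything here is a toy stand-in, not the paper's actual implementation: the policies are linear, the "views" are synthetic feature vectors, `augment` is a placeholder for the viewpoint perturbations, and all names and shapes are hypothetical. The student sees only one (augmented) view and regresses its action onto the frozen multi-view teacher's action with a plain MSE distillation loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 camera views, 64-dim features per view, 4-dim actions.
N_VIEWS, FEAT, ACT = 3, 64, 4

# Frozen "teacher": a fixed linear map from concatenated multi-view features
# to an action (stands in for the pre-trained multi-view RL policy).
W_teacher = rng.normal(size=(N_VIEWS * FEAT, ACT)) * 0.1

def teacher_policy(views):
    # views: (N_VIEWS, FEAT) -> action: (ACT,)
    return views.reshape(-1) @ W_teacher

def augment(view):
    # Placeholder for random shifts / extreme viewpoint changes.
    return view + rng.normal(scale=0.05, size=view.shape)

# "Student": a linear policy over a single augmented view, trained by SGD
# to imitate the teacher's action (knowledge distillation via MSE).
W_student = np.zeros((FEAT, ACT))
lr = 0.01
losses = []
for step in range(2000):
    views = rng.normal(size=(N_VIEWS, FEAT))
    target = teacher_policy(views)   # teacher acts on all views
    x = augment(views[0])            # student sees one perturbed view only
    pred = x @ W_student
    err = pred - target
    losses.append(float(np.mean(err**2)))
    W_student -= lr * np.outer(x, err) * 2 / ACT
```

The distillation loss cannot reach zero here, since the teacher also uses information from views the student never observes; the student only recovers the component of the teacher's action that is predictable from its single view, which mirrors the point that the distilled single-view policy approximates, rather than duplicates, the multi-view teacher.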