Robotic fabric manipulation has applications in home robotics, textiles, senior care, and surgery. Existing fabric manipulation techniques, however, are designed for specific tasks, making it difficult to generalize across different but related tasks. We build upon the Visual Foresight framework to learn fabric dynamics that can be efficiently reused to accomplish different sequential fabric manipulation tasks with a single goal-conditioned policy. We extend our earlier work on VisuoSpatial Foresight (VSF), which learns visual dynamics on domain-randomized RGB images and depth maps simultaneously and entirely in simulation. In that earlier work, we evaluated VSF on multi-step fabric smoothing and folding tasks against 5 baseline methods in simulation and on the da Vinci Research Kit (dVRK) surgical robot, without any demonstrations at train or test time. A key finding was that depth sensing significantly improves performance: RGBD data yields an 80% improvement in fabric folding success rate in simulation over pure RGB data. In this work, we vary 4 components of VSF: the data generation procedure, the visual dynamics model, the cost function, and the optimization procedure. Results suggest that training visual dynamics models using longer, corner-based actions can improve the efficiency of fabric folding by 76% and enable a physical sequential fabric folding task, with 90% reliability, that VSF could not previously perform. Code, data, videos, and supplementary material are available at https://sites.google.com/view/fabric-vsf/.
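The Visual Foresight family of methods plans by sampling candidate action sequences, rolling them through a learned visual dynamics model, and scoring predicted observations against a goal image. A minimal sketch of such a planner is below, assuming a generic cross-entropy-method (CEM) optimizer and an L2 cost on predicted RGBD frames; the function names, dimensions, and hyperparameters here are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def plan_action(dynamics_model, obs_rgbd, goal_rgbd,
                horizon=5, n_samples=100, n_elite=10, n_iters=3,
                action_dim=4):
    """Hypothetical CEM planner sketch for goal-conditioned Visual Foresight.

    dynamics_model(obs, action_seq) is assumed to return the predicted
    final RGBD frame after executing action_seq from obs. We iteratively
    refit a Gaussian over action sequences to the lowest-cost samples.
    """
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(n_iters):
        # Sample candidate action sequences from the current Gaussian.
        seqs = mean + std * np.random.randn(n_samples, horizon, action_dim)
        costs = []
        for seq in seqs:
            pred = dynamics_model(obs_rgbd, seq)
            # Cost: L2 distance between predicted and goal observation.
            costs.append(np.linalg.norm(pred - goal_rgbd))
        # Refit the Gaussian to the elite (lowest-cost) sequences.
        elite = seqs[np.argsort(costs)[:n_elite]]
        mean, std = elite.mean(axis=0), elite.std(axis=0)
    # Execute only the first action, then replan (model-predictive control).
    return mean[0]
```

In practice the first action of the best sequence is executed on the robot, a new observation is taken, and the planner is re-run, which is what allows one learned dynamics model to serve many different goal-conditioned tasks.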