Point clouds are a widely available, canonical data modality that conveys the 3D geometry of a scene. Despite significant progress in point cloud classification and segmentation, policy learning from this modality remains challenging, and most prior work in imitation learning focuses on learning policies from images or state information. In this paper, we propose a novel framework for learning policies from point clouds for robotic manipulation with tools. We introduce a neural network, ToolFlowNet, which predicts dense per-point flow on the tool that the robot controls and then uses that flow to derive the transformation the robot should execute. We apply this framework to imitation learning of challenging deformable object manipulation tasks that involve continuous tool movement, including scooping and pouring, and demonstrate significantly improved performance over baselines that do not use flow. We perform 50 physical scooping experiments with ToolFlowNet and attain 82% scooping success. See https://tinyurl.com/toolflownet for supplementary material.
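To make the step from per-point flow to a tool transformation concrete, here is a minimal sketch of one standard way to do it: a least-squares rigid fit (the Kabsch/SVD procedure) between the tool points and the same points displaced by their flow. The abstract does not specify how ToolFlowNet performs this step, so the function name `rigid_transform_from_flow`, the array shapes, and the choice of an unweighted fit are all assumptions made for illustration.

```python
import numpy as np

def rigid_transform_from_flow(points, flow):
    """Fit a rigid transform (R, t) so that R @ p + t ~= p + flow for each point.

    Hypothetical helper, not taken from the paper.
    points: (N, 3) tool points; flow: (N, 3) predicted per-point displacement.
    """
    targets = points + flow

    # Center both point sets so rotation can be solved independently of translation.
    p_mean = points.mean(axis=0)
    q_mean = targets.mean(axis=0)
    P = points - p_mean
    Q = targets - q_mean

    # Cross-covariance matrix and its SVD (Kabsch procedure).
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T

    # Translation that aligns the centroids after rotation.
    t = q_mean - R @ p_mean
    return R, t

if __name__ == "__main__":
    # Synthetic check: generate flow from a known rigid motion and recover it.
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(100, 3))
    theta = 0.3
    R_true = np.array([
        [np.cos(theta), -np.sin(theta), 0.0],
        [np.sin(theta),  np.cos(theta), 0.0],
        [0.0,            0.0,           1.0],
    ])
    t_true = np.array([0.1, -0.2, 0.05])
    flow = pts @ R_true.T + t_true - pts
    R, t = rigid_transform_from_flow(pts, flow)
    print(np.allclose(R, R_true), np.allclose(t, t_true))
```

With noise-free flow generated by a single rigid motion, the fit recovers that motion exactly up to floating-point error. With noisy predicted flow, it returns the rigid transform closest to the flow in the least-squares sense, which is one reason a dense per-point representation can still yield a single consistent tool action.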