Multi-View Fusion for Multi-Level Robotic Scene Understanding

We present a system for multi-level scene awareness for robotic manipulation. Given a sequence of camera-in-hand RGB images, the system computes three types of information: 1) a point cloud representation of all the surfaces in the scene, for obstacle avoidance; 2) the rough pose of unknown objects from categories corresponding to primitive shapes (e.g., cuboids and cylinders); and 3) the full 6-DoF pose of known objects. By developing and fusing recent techniques in these domains, we provide a rich scene representation for robot awareness. We demonstrate the importance of each of these modules, their complementary nature, and the potential benefits of the system in the context of robotic manipulation.
