We present a strategy for designing and building very general robot manipulation systems, based on integrating a general-purpose task-and-motion planner with engineered and learned perception modules that estimate the properties and affordances of unknown objects. Such systems are closed-loop policies that map from RGB images, depth images, and robot joint encoder measurements to robot joint position commands. We show that, following this strategy, a task-and-motion planner can be used to plan intelligent behaviors even in the absence of a priori knowledge of the set of manipulable objects, their geometries, and their affordances. We explore several different ways of implementing such perceptual modules for segmentation, property detection, shape estimation, and grasp generation, and show how these modules are integrated within the PDDLStream task-and-motion planning framework. Finally, we demonstrate that this strategy enables a single system to perform a wide variety of real-world multi-step manipulation tasks, generalizing over a broad class of objects, object arrangements, and goals, without any prior knowledge of the environment and without re-training.
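The sense-plan-act structure described above can be sketched as follows. This is a minimal illustration only: the class and function names (`Observation`, `ObjectEstimate`, `perceive`, `plan`) are hypothetical stand-ins for the paper's perception modules and TAMP planner, not the authors' actual interfaces.

```python
from dataclasses import dataclass, field

# Hypothetical interfaces illustrating the closed-loop policy structure:
# raw observations -> per-object estimates -> joint position commands.

@dataclass
class Observation:
    rgb: list           # RGB image (placeholder: nested lists, not arrays)
    depth: list         # depth image
    joint_angles: list  # robot joint encoder readings

@dataclass
class ObjectEstimate:
    segment_id: int
    properties: dict               # e.g. {"graspable": True}
    shape: str                     # estimated shape representation
    grasps: list = field(default_factory=list)  # candidate grasp poses

def perceive(obs: Observation) -> list:
    """Stand-in for the segmentation / property-detection / shape-estimation /
    grasp-generation modules: turn raw observations into object estimates."""
    return [ObjectEstimate(segment_id=0,
                           properties={"graspable": True},
                           shape="mesh_0",
                           grasps=["top_grasp"])]

def plan(objects: list, goal: str) -> list:
    """Stand-in for the task-and-motion planner: map object estimates and a
    goal to a sequence of robot commands."""
    return [f"move_to({o.grasps[0]})"
            for o in objects if o.properties.get("graspable")]

def closed_loop_policy(obs: Observation, goal: str) -> list:
    """One sense-plan cycle of the closed-loop policy; in a real system this
    repeats as new observations arrive."""
    return plan(perceive(obs), goal)

commands = closed_loop_policy(
    Observation(rgb=[], depth=[], joint_angles=[0.0] * 7),
    goal="holding(object_0)")
print(commands)  # → ['move_to(top_grasp)']
```

The key design point the abstract emphasizes is that `plan` needs no a priori object models: everything it consumes is produced online by `perceive` from the current observation.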