Searching for objects is a fundamental skill for robots. As such, we expect object search to eventually become an off-the-shelf capability, similar to, e.g., object detection and SLAM. Yet no system for 3D object search currently exists that generalizes across real robots and environments. In this paper, building upon a recent theoretical framework that exploits the octree structure for representing belief in 3D, we present GenMOS (Generalized Multi-Object Search), the first general-purpose system for multi-object search (MOS) in a 3D region that is robot-independent and environment-agnostic. GenMOS takes as input point cloud observations of the local region, object detection results, and localization of the robot's view pose, and outputs a 6D viewpoint to move to via online planning. In particular, GenMOS uses point cloud observations in three ways: (1) to simulate occlusion; (2) to inform occupancy and initialize octree belief; and (3) to sample a belief-dependent graph of view positions that avoids obstacles. We evaluate our system both in simulation and on two real robot platforms. Our system enables, for example, a Boston Dynamics Spot robot to find a toy cat hidden underneath a couch in under one minute. We further integrate 3D local search with 2D global search to handle larger areas, demonstrating the resulting system in a 25m$^2$ lobby area.
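To make the second use of point clouds concrete, the sketch below shows one plausible way an octree-style belief over object positions could be initialized from occupancy information. All class and parameter names here (`OctreeBelief`, `occ_weight`, etc.) are hypothetical illustrations, not GenMOS's actual API; the idea is simply that voxels marked occupied by obstacle points receive low prior weight, while free or unobserved voxels keep a uniform default.

```python
from dataclasses import dataclass, field

@dataclass
class OctreeBelief:
    """Hypothetical sketch: unnormalized belief over leaf voxels of a cubic
    search region with side 2**depth voxels. A real octree belief would also
    aggregate weights at coarser nodes for multi-resolution queries."""
    depth: int
    vals: dict = field(default_factory=dict)  # (x, y, z) -> weight

    def init_from_occupancy(self, occupied, default=1.0, occ_weight=0.01):
        """Down-weight voxels flagged as occupied (e.g., from a point cloud
        projected into the voxel grid); everything else stays uniform."""
        n = 2 ** self.depth
        for x in range(n):
            for y in range(n):
                for z in range(n):
                    w = occ_weight if (x, y, z) in occupied else default
                    self.vals[(x, y, z)] = w

    def prob(self, voxel):
        """Normalized belief that the target occupies `voxel`."""
        total = sum(self.vals.values())
        return self.vals[voxel] / total

# Usage: a 2x2x2 region where one corner voxel is blocked by an obstacle.
belief = OctreeBelief(depth=1)
belief.init_from_occupancy(occupied={(0, 0, 0)})
```

In this toy example, the occupied voxel `(0, 0, 0)` ends up with weight 0.01 against 1.0 for each of the other seven voxels, so its normalized probability is 0.01 / 7.01; planning would then prefer viewpoints covering high-belief free space.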