Detecting objects of interest, such as human survivors, safety equipment, and structure access points, is critical to any search-and-rescue operation. Robots deployed for such time-sensitive efforts rely on their onboard sensors to perform their designated tasks. However, because disaster-response operations are predominantly conducted under perceptually degraded conditions, commonly used sensors such as visual cameras and LiDARs suffer significant performance degradation. In response, this work presents a method that exploits the complementary nature of vision and depth sensors, leveraging multi-modal information to aid object detection at longer distances. In particular, depth and intensity values from sparse LiDAR returns are used to generate proposals for objects present in the environment. These proposals then guide a Pan-Tilt-Zoom (PTZ) camera system, which performs a directed search by adjusting its pose and zoom level to detect and classify objects in difficult environments. The proposed work has been thoroughly verified using an ANYmal quadruped robot in underground settings and on datasets collected during the DARPA Subterranean Challenge finals.
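To make the proposal-to-PTZ idea concrete, the following is a minimal sketch, not the paper's implementation: it assumes DBSCAN clustering of high-intensity LiDAR returns and a simple geometric heuristic for pan, tilt, and zoom; all function names, thresholds, and the field-of-view value are illustrative assumptions.

```python
# Hypothetical sketch of the LiDAR-proposal -> PTZ-pointing idea described above.
# The clustering choice (DBSCAN), thresholds, and zoom heuristic are assumptions,
# not the method proposed in the paper.
import numpy as np
from sklearn.cluster import DBSCAN


def lidar_proposals(points, intensities, min_intensity=0.2, eps=0.5, min_samples=5):
    """Cluster high-intensity sparse LiDAR returns into coarse object proposals.

    points:      (N, 3) array of x, y, z returns in the sensor frame
    intensities: (N,) array of normalized return intensities in [0, 1]
    Returns a list of (centroid, extent) tuples, one per cluster.
    """
    mask = intensities > min_intensity            # keep only strong returns
    pts = points[mask]
    if len(pts) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(pts)
    proposals = []
    for lbl in set(labels) - {-1}:                # -1 marks DBSCAN noise points
        cluster = pts[labels == lbl]
        centroid = cluster.mean(axis=0)
        extent = cluster.max(axis=0) - cluster.min(axis=0)
        proposals.append((centroid, extent))
    return proposals


def ptz_command(centroid, extent, hfov_deg=60.0):
    """Convert a proposal into a pan/tilt angle and a zoom factor.

    The zoom heuristic keeps the proposal's apparent width near a fixed
    fraction of the camera's horizontal field of view (assumed value).
    """
    x, y, z = centroid
    rng = np.linalg.norm(centroid)
    pan = np.degrees(np.arctan2(y, x))            # azimuth toward the proposal
    tilt = np.degrees(np.arcsin(z / rng))         # elevation toward the proposal
    width = max(np.linalg.norm(extent[:2]), 1e-3)
    apparent_deg = 2.0 * np.degrees(np.arctan2(width / 2.0, rng))
    zoom = np.clip(hfov_deg / (3.0 * apparent_deg), 1.0, 30.0)
    return pan, tilt, zoom


if __name__ == "__main__":
    # Synthetic example: one cluster of returns roughly 8 m ahead of the sensor.
    gen = np.random.default_rng(0)
    pts = gen.normal(loc=[8.0, 2.0, 0.5], scale=0.3, size=(50, 3))
    inten = gen.uniform(0.5, 1.0, size=50)
    for centroid, extent in lidar_proposals(pts, inten):
        print("PTZ command (pan, tilt, zoom):", ptz_command(centroid, extent))
```

In this sketch the directed search amounts to pointing the camera at each proposal centroid and zooming until the proposal's estimated angular width fills a fixed share of the image, after which a conventional image-based detector could be run on the magnified view.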