Generalized Object Search

Future collaborative robots must be capable of finding objects. Because object search is such a fundamental skill, we expect it to eventually become an off-the-shelf capability for any robot, much like object detection, SLAM, and motion planning. However, existing approaches make unrealistic compromises (e.g., reducing the problem from 3D to 2D), resort to ad hoc, greedy search strategies, or attempt to learn end-to-end policies in simulation that have yet to generalize across real robots and environments. This thesis argues that a practical and effective system for generalized object search can be achieved by modeling object search as a Partially Observable Markov Decision Process (POMDP) while exploiting structures in the human world (e.g., octrees, correlations) and in human-robot interaction (e.g., spatial language). In support of this argument, I develop methods and systems for (multi-)object search in 3D environments under uncertainty arising from limited field of view, occlusion, noisy and unreliable detectors, spatial correlations between objects, and possibly ambiguous spatial language (e.g., "The red car is behind Chase Bank"). In addition to evaluation in simulators such as PyGame, AirSim, and AI2-THOR, I design and implement a robot-independent, environment-agnostic system for generalized object search in 3D and deploy it on the Boston Dynamics Spot robot, the Kinova MOVO robot, and the Universal Robots UR5e robotic arm to perform object search in different environments. The system enables, for example, a Spot robot to find a toy cat hidden underneath a couch in a kitchen area in under one minute. This thesis also broadly surveys the object search literature and proposes taxonomies of object search problem settings, methods, and systems.
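
As a rough illustration of the belief tracking that a POMDP formulation of object search relies on, the sketch below applies a Bayes update to a belief over candidate object locations after one look with a noisy detector and a limited field of view. It is not the thesis's implementation: the function name `update_belief`, the representation of locations as cell identifiers, and the detector rates `tp` and `fp` are all assumptions made for this example.

```python
def update_belief(belief, fov, detected, tp=0.9, fp=0.05):
    """Bayes update of a belief over candidate object locations.

    belief:   dict mapping cell id -> prior probability of holding the object
    fov:      set of cell ids currently inside the field of view
    detected: True if the detector fired on this look, else False
    tp, fp:   assumed detection rates when the object is inside or
              outside the field of view (hypothetical values)
    """
    posterior = {}
    for cell, prior in belief.items():
        # Probability that the detector fires if the object is in this cell.
        p_detect = tp if cell in fov else fp
        likelihood = p_detect if detected else 1.0 - p_detect
        posterior[cell] = prior * likelihood

    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    # Normalize so the posterior sums to one.
    return {cell: p / total for cell, p in posterior.items()}


# Uniform prior over four cells; the robot looks at cells 0 and 1.
prior = {cell: 0.25 for cell in range(4)}
after_miss = update_belief(prior, fov={0, 1}, detected=False)
after_hit = update_belief(prior, fov={0, 1}, detected=True)
```

A look that detects nothing shifts probability toward the cells outside the field of view, while a detection concentrates it on the visible cells. A planner would then choose the next viewpoint based on the updated belief.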
