Task-Driven Out-of-Distribution Detection with Statistical Guarantees for Robot Learning

Our goal is to perform out-of-distribution (OOD) detection, i.e., to detect when a robot is operating in environments that are drawn from a different distribution than the environments used to train the robot. We leverage Probably Approximately Correct (PAC)-Bayes theory in order to train a policy with a guaranteed bound on performance on the training distribution. Our key idea for OOD detection then relies on the following intuition: violation of the performance bound on test environments provides evidence that the robot is operating OOD. We formalize this via statistical techniques based on p-values and concentration inequalities. The resulting approach (i) provides guaranteed confidence bounds on OOD detection, and (ii) is task-driven and sensitive only to changes that impact the robot's performance. We demonstrate our approach on a simulated example of grasping objects with unfamiliar poses or shapes. We also present both simulation and hardware experiments for a drone performing vision-based obstacle avoidance in unfamiliar environments (including wind disturbances and different obstacle densities). Our examples demonstrate that we can perform task-driven OOD detection within just a handful of trials. Comparisons with baselines also demonstrate the advantages of our approach in terms of providing statistical guarantees and being insensitive to task-irrelevant distribution shifts.
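The intuition that a violated performance bound is evidence of OOD operation can be illustrated with a small sketch. It assumes that each trial cost lies in [0, 1], that a bound on expected cost for the training distribution is already available (for example from a PAC-Bayes certificate, treated here as given), and that Hoeffding's inequality is the concentration inequality used; the abstract does not say which inequality the authors apply. The names `hoeffding_p_value`, `is_ood`, `cost_bound`, and `significance` are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def hoeffding_p_value(test_costs: Sequence[float], cost_bound: float) -> float:
    """Upper bound on the probability of seeing a mean test cost this high
    if the test environments came from the training distribution.

    Assumptions (not from the source): costs are i.i.d. and lie in [0, 1],
    and cost_bound upper-bounds the expected cost under the training
    distribution.
    """
    n = len(test_costs)
    if n == 0:
        raise ValueError("need at least one test trial")
    if any(c < 0.0 or c > 1.0 for c in test_costs):
        raise ValueError("costs are assumed to lie in [0, 1]")

    mean_cost = sum(test_costs) / n
    # Only a violation of the bound counts as evidence of OOD operation,
    # so performance better than the bound yields a p-value of 1.
    excess = max(mean_cost - cost_bound, 0.0)
    # Hoeffding: P(mean - E[cost] >= t) <= exp(-2 * n * t^2) for costs in [0, 1].
    return math.exp(-2.0 * n * excess ** 2)


def is_ood(test_costs: Sequence[float], cost_bound: float,
           significance: float = 0.05) -> bool:
    """Flag OOD when the p-value falls at or below the significance level."""
    return hoeffding_p_value(test_costs, cost_bound) <= significance


if __name__ == "__main__":
    # Hypothetical numbers: bound of 0.2 on expected cost, ten failed trials.
    print(is_ood([1.0] * 10, cost_bound=0.2))
    # Ten trials that perform better than the bound give no evidence of OOD.
    print(is_ood([0.1] * 10, cost_bound=0.2))
```

Because the test statistic depends only on task cost, a distribution shift that leaves the robot's performance unchanged does not lower the p-value, which is the sense in which such a detector is task-driven.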
