Hydra: A Real-time Spatial Perception System for 3D Scene Graph Construction and Optimization

3D scene graphs have recently emerged as a powerful high-level representation of 3D environments. A 3D scene graph describes the environment as a layered graph where nodes represent spatial concepts at multiple levels of abstraction and edges represent relations between concepts. While 3D scene graphs can serve as an advanced "mental model" for robots, how to build such a rich representation in real time is still uncharted territory. This paper describes a real-time Spatial Perception System, a suite of algorithms to build a 3D scene graph from sensor data in real time. Our first contribution is to develop real-time algorithms to incrementally construct the layers of a scene graph as the robot explores the environment; these algorithms build a local Euclidean Signed Distance Function (ESDF) around the current robot location, extract a topological map of places from the ESDF, and then segment the places into rooms using an approach inspired by community-detection techniques. Our second contribution is to investigate loop closure detection and optimization in 3D scene graphs. We show that 3D scene graphs allow defining hierarchical descriptors for loop closure detection; our descriptors capture statistics across layers in the scene graph, ranging from low-level visual appearance to summary statistics about objects and places. We then propose the first algorithm to optimize a 3D scene graph in response to loop closures; our approach relies on embedded deformation graphs to simultaneously correct all layers of the scene graph. We implement the proposed Spatial Perception System in an architecture named Hydra, which combines fast early and mid-level perception processes with slower high-level perception. We evaluate Hydra on simulated and real data and show that it reconstructs 3D scene graphs with an accuracy comparable to that of batch offline methods despite running online.
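
To make the layered-graph idea concrete, the following is a minimal sketch of such a data structure, together with one possible "summary statistic about objects" computed for a room. It is not Hydra's implementation: the class `LayeredSceneGraph`, the three layer names, the rule that edges join only the same or adjacent layers, and the function `object_label_histogram` are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

# Assumed layer ordering, lowest to highest abstraction (illustrative names).
LAYERS = ("objects", "places", "rooms")


@dataclass
class LayeredSceneGraph:
    layer_of: dict = field(default_factory=dict)  # node id -> layer name
    attrs: dict = field(default_factory=dict)     # node id -> attribute dict
    siblings: dict = field(default_factory=dict)  # node id -> same-layer neighbors
    children: dict = field(default_factory=dict)  # node id -> nodes one layer below
    parent: dict = field(default_factory=dict)    # node id -> node one layer above

    def add_node(self, node_id, layer, **attributes):
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.layer_of[node_id] = layer
        self.attrs[node_id] = attributes
        self.siblings[node_id] = set()
        self.children[node_id] = set()

    def add_edge(self, a, b):
        """Connect two nodes; the edge type follows from their layers."""
        gap = LAYERS.index(self.layer_of[a]) - LAYERS.index(self.layer_of[b])
        if gap == 0:
            # Intra-layer edge, e.g. two places that are mutually traversable.
            self.siblings[a].add(b)
            self.siblings[b].add(a)
        elif abs(gap) == 1:
            # Inter-layer edge, e.g. a room containing a place.
            upper, lower = (a, b) if gap > 0 else (b, a)
            self.children[upper].add(lower)
            self.parent[lower] = upper
        else:
            # Assumption of this sketch, not a claim about Hydra.
            raise ValueError("edges may only join the same or adjacent layers")

    def descendants(self, node_id, layer):
        """Collect all nodes of `layer` reachable by walking down from node_id."""
        found, stack = [], [node_id]
        while stack:
            current = stack.pop()
            if self.layer_of[current] == layer:
                found.append(current)
            stack.extend(self.children[current])
        return found


def object_label_histogram(graph, node_id):
    """Count object labels beneath a node (a hypothetical summary statistic)."""
    objects = graph.descendants(node_id, "objects")
    return Counter(graph.attrs[obj]["label"] for obj in objects)


graph = LayeredSceneGraph()
graph.add_node("room0", "rooms")
graph.add_node("place0", "places")
graph.add_node("place1", "places")
graph.add_node("chair0", "objects", label="chair")
graph.add_node("chair1", "objects", label="chair")
graph.add_node("table0", "objects", label="table")
graph.add_edge("room0", "place0")
graph.add_edge("room0", "place1")
graph.add_edge("place0", "place1")
graph.add_edge("place0", "chair0")
graph.add_edge("place1", "chair1")
graph.add_edge("place1", "table0")

histogram = object_label_histogram(graph, "room0")
print(histogram)
```

Storing parent and child links separately from same-layer links keeps "which room contains this object" a short upward walk, and two rooms could be compared by the distance between their label histograms. The descriptors in the paper span more layers and also include visual appearance, which this sketch leaves out.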
