Dynamic environments containing unstructured moving objects are a hard problem for Simultaneous Localization and Mapping (SLAM). The motion of rigid objects can typically be tracked by exploiting their texture and geometric features. Humans moving in the scene, however, are often among the most important interactive targets, yet their non-rigid shapes make them very hard to track and reconstruct robustly. In this work, we present a fast, learning-based human detector that isolates dynamic human objects, and we use it to realise a real-time dense background reconstruction framework. We go further by estimating and reconstructing human pose and shape. The final output maps provide not only the dense static background but also the dynamic human meshes and their trajectories. Our Dynamic SLAM system runs at around 26 frames per second (fps) on GPUs; with accurate human pose estimation additionally enabled, it runs at up to 10 fps.
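The idea of isolating humans before background reconstruction can be illustrated with a minimal sketch. It is not the system described here: the detector output format (axis-aligned boxes), the use of 0.0 as an invalid-depth sentinel, and the names `mask_humans` and `static_fraction` are all assumptions made for illustration. Depth pixels inside each detected human region are invalidated so that a downstream dense fusion step would integrate only static background.

```python
from typing import List, Tuple

# Hypothetical detector output: (x_min, y_min, x_max, y_max), max exclusive.
Box = Tuple[int, int, int, int]

# Assumption: a depth of 0.0 means "no measurement" and is skipped by fusion.
INVALID_DEPTH = 0.0


def mask_humans(depth: List[List[float]], boxes: List[Box],
                margin: int = 0) -> List[List[float]]:
    """Return a copy of the depth frame with human regions invalidated.

    `margin` grows each box by that many pixels on every side, a simple
    guard against detections that are slightly too tight.
    """
    height = len(depth)
    width = len(depth[0]) if height else 0
    masked = [row[:] for row in depth]  # leave the input frame untouched
    for x0, y0, x1, y1 in boxes:
        # Clamp the (possibly enlarged) box to the image bounds.
        xa, ya = max(0, x0 - margin), max(0, y0 - margin)
        xb, yb = min(width, x1 + margin), min(height, y1 + margin)
        for y in range(ya, yb):
            for x in range(xa, xb):
                masked[y][x] = INVALID_DEPTH
    return masked


def static_fraction(masked: List[List[float]]) -> float:
    """Fraction of pixels still carrying a valid (static) depth value."""
    total = sum(len(row) for row in masked)
    valid = sum(1 for row in masked for v in row if v != INVALID_DEPTH)
    return valid / total if total else 0.0


if __name__ == "__main__":
    frame = [[1.5] * 6 for _ in range(4)]   # toy 6x4 depth frame, metres
    detections = [(1, 1, 3, 3)]             # one hypothetical human box
    background_only = mask_humans(frame, detections)
    print(static_fraction(background_only))
```

A box mask is the coarsest choice; a per-pixel segmentation mask would discard less background, at the cost of a heavier detector.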