A spatial AI that can perform complex tasks through visual signals and cooperate with humans is highly anticipated. To achieve this, we need a visual SLAM that easily adapts to new scenes without pre-training and generates dense maps for downstream tasks in real time. None of the previous learning-based or non-learning-based visual SLAMs satisfies all these needs due to the intrinsic limitations of their components. In this work, we develop a visual SLAM named Orbeez-SLAM, which successfully combines implicit neural representation with visual odometry to achieve our goals. Moreover, Orbeez-SLAM can work with a monocular camera since it only needs RGB inputs, making it widely applicable to the real world. Results show that our SLAM is up to 800x faster than the strong baseline with superior rendering outcomes. Code link: https://github.com/MarvinChung/Orbeez-SLAM.