Active stereo systems are widely used in the robotics industry due to their low cost and high-quality depth maps. These depth sensors, however, suffer from stereo artefacts and do not provide dense depth estimates. In this work, we present the first self-supervised depth completion method for active stereo systems that predicts accurate dense depth maps. Our system leverages a feature-based visual-inertial SLAM system to produce motion estimates and accurate (but sparse) 3D landmarks. The 3D landmarks are used both as model input and as supervision during training. The motion estimates are used in our novel reconstruction loss, which relies on a combination of passive and active stereo frames and yields significant improvements in the textureless areas that are common in indoor environments. Because no active stereo datasets are publicly available, we release a real dataset, together with the additional information that a publicly available synthetic dataset needs to support active depth completion and prediction. Through rigorous evaluations, we show that our method outperforms the state of the art on both datasets. Additionally, we show how our method obtains more complete, and therefore safer, 3D maps when used on a robotic platform.