Visual navigation and three-dimensional (3D) scene reconstruction are essential for robots to interact with their surrounding environment. Large-scale scenes and critical camera motions remain major challenges for the research community in achieving this goal. We propose a pose-only imaging geometry framework and associated algorithms that help address these challenges. The representation is a linear function of the cameras' global translations, which allows for efficient and robust camera motion estimation. As a result, the spatial feature coordinates can be reconstructed analytically, without nonlinear optimization. Experiments demonstrate that the computational efficiency of recovering the scene and the associated camera poses improves by two to four orders of magnitude. This solution may help unlock real-time 3D visual computing in many forefront applications.
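To illustrate what reconstructing feature coordinates analytically from camera poses can look like, the following is a minimal two-view sketch and not necessarily the exact formulation used in this work. It assumes a relative pose (R, t) such that a point with coordinates X_i in the first camera frame maps to X_j = R X_i + t in the second, and normalized homogeneous image coordinates x_i and x_j with third component 1. Taking the cross product of z_j x_j = z_i R x_i + t with x_j eliminates z_j and gives the closed-form depth z_i = ||x_j × t|| / ||x_j × (R x_i)||, assuming positive depth. The function names are hypothetical and chosen only for this example.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def matvec(R, v):
    """Product of a 3x3 matrix (list of rows) and a 3-vector."""
    return [sum(R[r][c] * v[c] for c in range(3)) for r in range(3)]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def depth_in_first_view(x_i, x_j, R, t):
    """Closed-form depth of a point in view i, given the relative pose.

    Assumes X_j = R X_i + t and a point in front of the first camera.
    """
    denom = norm(cross(x_j, matvec(R, x_i)))
    if denom < 1e-12:
        # x_j is parallel to R x_i: no parallax (for example, zero
        # translation), so this pair cannot determine the depth.
        raise ValueError("degenerate configuration: no parallax")
    return norm(cross(x_j, t)) / denom


def reconstruct_point(x_i, x_j, R, t):
    """3D point in the first camera frame, with no iterative optimization."""
    z_i = depth_in_first_view(x_i, x_j, R, t)
    return [z_i * c for c in x_i]


if __name__ == "__main__":
    # Synthetic check: project a known point into both views, then recover it.
    angle = 0.1  # rotation about the y-axis, in radians
    R = [
        [math.cos(angle), 0.0, math.sin(angle)],
        [0.0, 1.0, 0.0],
        [-math.sin(angle), 0.0, math.cos(angle)],
    ]
    t = [1.0, 0.0, 0.2]
    X_i = [0.5, -0.2, 4.0]
    X_j = [a + b for a, b in zip(matvec(R, X_i), t)]
    x_i = [c / X_i[2] for c in X_i]
    x_j = [c / X_j[2] for c in X_j]
    print(reconstruct_point(x_i, x_j, R, t))
```

The depth comes from a single ratio of norms, so each feature is recovered independently once the poses are known. The explicit degeneracy check reflects that a pair of views without parallax cannot determine the depth.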