The localization of objects is a crucial task in applications such as robotics, virtual and augmented reality, and the transportation of goods in warehouses. Recent advances in deep learning have enabled localization from monocular cameras. While structure from motion (SfM) estimates the absolute pose from a point cloud, absolute pose regression (APR) methods learn a semantic understanding of the environment through neural networks. Both approaches, however, face environment-induced challenges such as motion blur, lighting changes, repetitive patterns, and feature-less structures. This study addresses these challenges by incorporating additional information and regularizing the absolute pose with relative pose regression (RPR) methods. The optical flow between consecutive images is computed with the Lucas-Kanade algorithm, and the relative pose is predicted by a small auxiliary recurrent convolutional network. Fusing absolute and relative poses is a complex task because the global and local coordinate systems do not match. State-of-the-art methods fusing absolute and relative poses use pose graph optimization (PGO) to regularize the absolute pose predictions with relative poses. In this work, we propose recurrent fusion networks that optimally align absolute and relative pose predictions to improve the absolute pose estimate. We evaluate eight different recurrent units and construct a simulation environment to pre-train the APR and RPR networks for better generalization. In addition, we record a large database of different scenarios in a challenging large-scale indoor environment that mimics a warehouse with transportation robots. We conduct hyperparameter searches and experiments demonstrating the effectiveness of our recurrent fusion method compared to PGO.
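To make the optical-flow step concrete, the following is a minimal, illustrative sketch of the classic Lucas-Kanade method in pure numpy: it solves the windowed least-squares brightness-constancy system for a single point between two frames. The function name, window size, and synthetic test images are our own assumptions for illustration and are not taken from the paper's implementation, which operates on real camera frames.

```python
import numpy as np

def lucas_kanade_flow(I0, I1, center, win=7):
    """Estimate the optical flow (u, v) at `center` between frames I0 and I1
    by solving the Lucas-Kanade least-squares system Ix*u + Iy*v = -It
    over a (win x win) window. Purely illustrative, single-point version."""
    y, x = center
    r = win // 2
    # Spatial gradients (central differences) and temporal derivative.
    Iy, Ix = np.gradient(I0.astype(float))
    It = I1.astype(float) - I0.astype(float)
    # Stack the per-pixel brightness-constancy constraints inside the window.
    sl = np.s_[y - r:y + r + 1, x - r:x + r + 1]
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)
    b = -It[sl].ravel()
    # Least-squares solution of A @ [u, v] = b.
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # (u, v) in pixels

# Synthetic check: a smooth Gaussian blob translated by one pixel in x.
yy, xx = np.mgrid[0:64, 0:64].astype(float)
I0 = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / 50.0)
I1 = np.exp(-((xx - 33) ** 2 + (yy - 32) ** 2) / 50.0)
u, v = lucas_kanade_flow(I0, I1, center=(32, 32))
```

The recovered flow is close to the true (1, 0) pixel shift; production systems typically use a pyramidal, multi-point variant (e.g. OpenCV's `cv2.calcOpticalFlowPyrLK`) rather than this single-window solve.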