Navigation inside a closed area with no GPS signal availability is a highly challenging task. To tackle this problem, image-based methods have recently attracted the attention of many researchers. These methods either extract features (e.g., with SIFT or SOSNet) and map the descriptive ones to the camera position and rotation, or deploy an end-to-end system that estimates this information directly from RGB images, similar to PoseNet. While the former methods suffer from a heavy computational burden at test time, the latter lack accuracy and robustness against environmental changes and object movements. End-to-end systems are, however, fast at test and inference time and well suited to real-world applications, even though their training phase can be longer than that of feature-based methods. In this paper, a novel multi-modal end-to-end system for large-scale indoor positioning is proposed, named APS (Alpha Positioning System), which integrates a Pix2Pix GAN that reconstructs the point cloud pair of the input query image with a deep CNN that robustly estimates the camera position and rotation. Existing datasets lack the paired RGB/point cloud images of indoor environments required for this integration, so we created a new dataset to fill this gap. With the proposed APS system, we achieve highly accurate camera positioning with a precision level of less than a centimeter.
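To make the described two-stage, multi-modal idea concrete, the following is a minimal PyTorch sketch: a Pix2Pix-style generator reconstructs a point-cloud rendering from the RGB query image, and a PoseNet-style regression CNN consumes the RGB/point-cloud pair to predict camera position and rotation. All class names, layer sizes, and the ResNet-18 backbone are illustrative assumptions, not the authors' actual APS implementation.

```python
# Hypothetical sketch of the RGB -> point cloud -> pose pipeline described above.
import torch
import torch.nn as nn
import torchvision.models as models

class PointCloudGenerator(nn.Module):
    """Stand-in for a Pix2Pix-style U-Net generator (RGB image -> point-cloud image)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, rgb):
        return self.decoder(self.encoder(rgb))

class PoseRegressor(nn.Module):
    """PoseNet-style CNN: takes the 6-channel RGB + reconstructed point-cloud pair
    and regresses a 3-D position and a unit quaternion for rotation."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.conv1 = nn.Conv2d(6, 64, 7, stride=2, padding=3, bias=False)
        backbone.fc = nn.Identity()
        self.backbone = backbone
        self.fc_xyz = nn.Linear(512, 3)   # camera position
        self.fc_quat = nn.Linear(512, 4)  # camera rotation (quaternion)

    def forward(self, x):
        feat = self.backbone(x)
        return self.fc_xyz(feat), nn.functional.normalize(self.fc_quat(feat), dim=-1)

# End-to-end inference on a single query image.
generator, regressor = PointCloudGenerator(), PoseRegressor()
rgb = torch.randn(1, 3, 256, 256)                      # query RGB image
pc_image = generator(rgb)                              # reconstructed point-cloud pair
xyz, quat = regressor(torch.cat([rgb, pc_image], dim=1))
print(xyz.shape, quat.shape)                           # (1, 3), (1, 4)
```

At training time, the generator would be fitted on the paired RGB/point cloud dataset mentioned in the abstract, while the regressor would be trained with a pose loss combining position and rotation errors, as in PoseNet-style systems.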