Some recent vision-based relocalization algorithms rely on deep learning methods to regress camera pose directly from image data. This paper focuses on the loss functions that embed the error between two poses to perform deep-learning-based camera pose regression. Existing loss functions are either difficult-to-tune multi-objective functions or unstable reprojection errors that rely on ground-truth 3D scene points and require two-step training. To address these issues, we introduce a novel loss function based on multiplane homography integration. This new function requires no prior initialization and depends only on physically interpretable hyperparameters. Furthermore, experiments carried out on well-established relocalization datasets show that, compared with existing loss functions, it best minimizes the mean square reprojection error during training.
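To make the idea of comparing poses through plane-induced homographies concrete, here is a minimal sketch, not the authors' exact formulation: each pose induces a homography H = K (R + t nᵀ / d) K⁻¹ on a plane with normal n at depth d, and a pose error can be scored by the transfer error of image points under the estimated versus ground-truth homographies, averaged over several plane depths as a crude stand-in for "multiplane integration". The intrinsics K, the plane depths, and the grid size are all illustrative assumptions.

```python
import numpy as np

def plane_homography(R, t, n, d, K):
    """Homography induced by the plane n.X = d: H = K (R + t n^T / d) K^{-1}."""
    return K @ (R + np.outer(t, n) / d) @ np.linalg.inv(K)

def homography_pose_loss(R_est, t_est, R_gt, t_gt, K, depths=(2.0, 5.0, 10.0)):
    """Mean squared transfer error of a pixel grid over several plane depths.

    A hedged illustration of a homography-based pose loss, not the paper's
    actual loss: identical poses give exactly zero, and the value grows with
    the rotation/translation discrepancy in a physically interpretable way
    (pixels squared).
    """
    n = np.array([0.0, 0.0, 1.0])  # fronto-parallel plane normal (assumption)
    # Sparse grid of pixel locations in homogeneous coordinates.
    xs, ys = np.meshgrid(np.linspace(0.0, 640.0, 8), np.linspace(0.0, 480.0, 6))
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    err = 0.0
    for d in depths:
        p_est = plane_homography(R_est, t_est, n, d, K) @ pts
        p_gt = plane_homography(R_gt, t_gt, n, d, K) @ pts
        p_est = p_est[:2] / p_est[2]   # perspective division
        p_gt = p_gt[:2] / p_gt[2]
        err += np.mean(np.sum((p_est - p_gt) ** 2, axis=0))
    return err / len(depths)

# Illustrative intrinsics (focal length 500 px, principal point at 320x240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R_id = np.eye(3)
print(homography_pose_loss(R_id, np.zeros(3), R_id, np.zeros(3), K))  # 0.0
```

Note that, unlike reprojection-error losses, this sketch needs no ground-truth 3D scene points: the virtual planes play their role, which is what makes a single-step training plausible.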