Absolute pose regressor (APR) networks are trained to estimate the pose of the camera given a captured image. They compute latent image representations from which the camera position and orientation are regressed. Compared to structure-based localization schemes, which provide state-of-the-art accuracy, APRs offer a different tradeoff between localization accuracy, runtime, and memory. In this work, we introduce Camera Pose Auto-Encoders (PAEs), multilayer perceptrons that are trained via a Teacher-Student approach to encode camera poses using APRs as their teachers. We show that the resulting latent pose representations can closely reproduce APR performance and demonstrate their effectiveness for related tasks. Specifically, we propose a lightweight test-time optimization in which the closest training poses are encoded and used to refine the camera position estimate. This procedure achieves a new state-of-the-art position accuracy for APRs on both the CambridgeLandmarks and 7Scenes benchmarks. We also show that training images can be reconstructed from the learned pose encoding, paving the way for integrating visual information from the training set at a low memory cost. Our code and pre-trained models are available at https://github.com/yolish/camera-pose-auto-encoders.
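The retrieval step behind the test-time optimization can be illustrated with a minimal sketch. It assumes that "closest" means Euclidean distance between camera positions, and it replaces the learned, encoding-based refinement with a simple inverse-distance weighted mean. Both choices, along with the names `nearest_train_poses` and `refine_position`, are illustrative assumptions and are not taken from the released code.

```python
import math


def nearest_train_poses(query_position, train_positions, k=3):
    """Return indices of the k training positions closest to the query.

    Assumption: closeness is Euclidean distance between positions only.
    """
    order = sorted(
        range(len(train_positions)),
        key=lambda i: math.dist(query_position, train_positions[i]),
    )
    return order[:k]


def refine_position(query_position, train_positions, k=3, eps=1e-9):
    """Placeholder refinement: inverse-distance weighted mean of neighbours.

    This is a stand-in for the learned refinement described in the abstract,
    which operates on encoded poses; it is NOT the authors' procedure.
    """
    idx = nearest_train_poses(query_position, train_positions, k)
    weights = [
        1.0 / (math.dist(query_position, train_positions[i]) + eps) for i in idx
    ]
    total = sum(weights)
    dim = len(query_position)
    return tuple(
        sum(w * train_positions[i][d] for w, i in zip(weights, idx)) / total
        for d in range(dim)
    )


if __name__ == "__main__":
    # Hypothetical training positions (x, y, z) and an initial APR estimate.
    train = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
    estimate = (0.1, 0.0, 0.0)
    print(nearest_train_poses(estimate, train, k=2))
    print(refine_position(estimate, train, k=2))
```

In the approach described above, the retrieved poses would be passed through the PAE and the refinement would act on the resulting encodings. The weighted mean here only marks where that step would sit.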