Successful visual navigation depends upon capturing images that contain sufficient useful information. In this letter, we explore a data-driven approach to account for environmental lighting changes, improving the quality of images for use in visual odometry (VO) or visual simultaneous localization and mapping (SLAM). We train a deep convolutional neural network model to predictively adjust camera gain and exposure time parameters such that consecutive images contain a maximal number of matchable features. The training process is fully self-supervised: our training signal is derived from an underlying VO or SLAM pipeline and, as a result, the model is optimized to perform well with that specific pipeline. We demonstrate through extensive real-world experiments that our network can anticipate and compensate for dramatic lighting changes (e.g., transitions into and out of road tunnels), maintaining a substantially higher number of inlier feature matches than competing camera parameter control algorithms.
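The abstract gives no implementation details, so the following is only an illustrative sketch, not the authors' code: it shows (1) how a matchability score between consecutive frames could be computed from cross-checked ORB matches with OpenCV, standing in for the inlier-match signal produced by the VO/SLAM front end, and (2) a small CNN that maps a frame to predicted gain and exposure-time adjustments. All names here (`ExposureGainNet`, `match_count`) and the architecture are hypothetical assumptions.

```python
# Hypothetical sketch of the two ingredients described in the abstract:
# (1) a self-supervised training signal -- the number of matched features
#     between consecutive frames -- and
# (2) a small CNN mapping the current image to predicted gain and
#     exposure-time adjustments. Illustrative only, not the authors' code.
import cv2
import numpy as np
import torch
import torch.nn as nn

def match_count(img_a: np.ndarray, img_b: np.ndarray) -> int:
    """Count cross-checked ORB feature matches between two grayscale frames."""
    orb = cv2.ORB_create(nfeatures=1000)
    _, des_a = orb.detectAndCompute(img_a, None)
    _, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    return len(matcher.match(des_a, des_b))

class ExposureGainNet(nn.Module):
    """Tiny CNN regressing (delta_exposure, delta_gain) from one frame."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)  # predicted parameter adjustments

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))

# At run time, the predicted adjustments would be applied through the
# camera driver before the next frame is captured; during training,
# match_count() on the resulting consecutive frames supplies the
# self-supervised score that the network is fit against.
```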