Most deep-learning-based depth and ego-motion networks have been designed for visible cameras. However, visible cameras heavily rely on the presence of an external light source, so they are difficult to use under low-light conditions such as night scenes, tunnels, and other harsh environments. A thermal camera is one solution to this problem because it detects long-wave infrared (LWIR) radiation regardless of any external light source. Despite this advantage, however, depth and ego-motion estimation for thermal cameras has not been actively explored so far. In this paper, we propose an unsupervised learning method for all-day depth and ego-motion estimation. The proposed method exploits a multi-spectral consistency loss that gives complementary supervision to the networks by reconstructing visible and thermal images with the depth and pose estimated from thermal images. Networks trained with the proposed method robustly estimate depth and pose from monocular thermal video under low-light and even zero-light conditions. To the best of our knowledge, this is the first work to simultaneously estimate both depth and ego-motion from monocular thermal video in an unsupervised manner.
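To make the multi-spectral consistency idea concrete, the following is a minimal sketch, not the authors' implementation: depth and relative pose predicted from thermal frames are used to warp a source thermal image and a source visible image into the target view, and the photometric residuals in both spectra supervise the networks. It assumes PyTorch, assumes the visible frames are spatially aligned with the thermal frames, and all function and tensor names (`inverse_warp`, `multispectral_consistency_loss`, intrinsics `K_thermal`/`K_visible`) are illustrative placeholders.

```python
# Hedged sketch of a multi-spectral photometric consistency loss.
# Assumptions: PyTorch, batched tensors, visible frames rectified to the
# thermal view; the paper's actual loss terms may differ (e.g. adding SSIM).
import torch
import torch.nn.functional as F


def inverse_warp(src_img, depth, pose, K):
    """Warp src_img into the target view using the target depth map,
    the 4x4 relative pose, and the 3x3 camera intrinsics K."""
    b, _, h, w = depth.shape
    dev, dt = depth.device, depth.dtype
    # Homogeneous pixel grid of shape (b, 3, h*w).
    ys, xs = torch.meshgrid(torch.arange(h, device=dev, dtype=dt),
                            torch.arange(w, device=dev, dtype=dt),
                            indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], 0).reshape(1, 3, -1).expand(b, -1, -1)
    # Back-project pixels to 3-D points, then transform by the relative pose.
    cam = torch.inverse(K) @ pix * depth.reshape(b, 1, -1)
    cam_h = torch.cat([cam, torch.ones(b, 1, h * w, device=dev, dtype=dt)], 1)
    proj = K @ (pose @ cam_h)[:, :3, :]
    # Perspective divide and normalisation to [-1, 1] for grid_sample.
    px = proj[:, 0] / (proj[:, 2] + 1e-7)
    py = proj[:, 1] / (proj[:, 2] + 1e-7)
    grid = torch.stack([2 * px / (w - 1) - 1, 2 * py / (h - 1) - 1], -1)
    return F.grid_sample(src_img, grid.reshape(b, h, w, 2), align_corners=True)


def multispectral_consistency_loss(tgt_thermal, src_thermal,
                                   tgt_visible, src_visible,
                                   depth, pose, K_thermal, K_visible):
    """L1 photometric residual in both spectra; depth and pose come from
    networks that only see thermal frames."""
    rec_t = inverse_warp(src_thermal, depth, pose, K_thermal)
    rec_v = inverse_warp(src_visible, depth, pose, K_visible)
    return (tgt_thermal - rec_t).abs().mean() + (tgt_visible - rec_v).abs().mean()
```

In this sketch the visible-spectrum term is what supplies the complementary supervision: even when the thermal residual is weakly textured, the warped visible image still penalises incorrect depth and pose.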