Benefiting from insensitivity to light and strong penetration in foggy environments, infrared cameras are widely used for sensing in nighttime traffic scenes. However, the low contrast and lack of chromaticity of thermal infrared (TIR) images hinder both human interpretation and the portability of high-level computer vision algorithms. Colorization that translates a nighttime TIR image into a daytime color image (NTIR2DC) may be a promising way to facilitate nighttime scene perception. Despite recent impressive advances in image translation, semantic encoding entanglement and geometric distortion in the NTIR2DC task remain under-addressed. Hence, we propose a toP-down attEntion And gRadient aLignment based GAN, referred to as PearlGAN. First, a top-down guided attention module and an elaborate attentional loss are designed to reduce semantic encoding ambiguity during translation. Then, a structured gradient alignment loss is introduced to encourage edge consistency between the translated and input images. In addition, pixel-level annotation is carried out on a subset of the FLIR and KAIST datasets to evaluate the semantic preservation performance of multiple translation methods. Furthermore, a new metric is devised to evaluate geometric consistency in the translation process. Extensive experiments demonstrate the superiority of the proposed PearlGAN over other image translation methods for the NTIR2DC task. The source code and labeled segmentation masks will be available at \url{https://github.com/FuyaLuo/PearlGAN/}.