Nighttime thermal infrared (NTIR) image colorization, also known as translation of NTIR images into daytime color images (NTIR2DC), is a promising research direction for facilitating nighttime scene perception by humans and intelligent systems under unfavorable conditions (e.g., complete darkness). However, existing methods colorize small-sample classes poorly. Moreover, reducing high-confidence noise in pseudo-labels and addressing image gradient disappearance during translation remain under-explored, and preventing edge distortion during translation is also challenging. To address these issues, we propose a novel learning framework called Memory-guided cOllaboRative atteNtion Generative Adversarial Network (MornGAN), which is inspired by the analogical reasoning mechanisms of humans. Specifically, a memory-guided sample selection strategy and an adaptive collaborative attention loss are devised to enhance the semantic preservation of small-sample categories. In addition, we propose an online semantic distillation module to mine and refine the pseudo-labels of NTIR images. Further, a conditional gradient repair loss is introduced to reduce edge distortion during translation. Extensive experiments on the NTIR2DC task show that the proposed MornGAN significantly outperforms other image-to-image translation methods in terms of semantic preservation and edge consistency, which helps improve object detection accuracy remarkably.