DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders

Automatic image colorization is a challenging problem. Because the task is highly ill-posed and carries multi-modal uncertainty, directly training a deep neural network usually leads to incorrect semantic colors and low color richness. Recent transformer-based methods can deliver better results, but they often rely on manually designed priors, which are hard to implement and generalize poorly. Moreover, they tend to introduce serious color bleeding because color attention is performed on single-scale features and therefore fails to exploit sufficient semantic information. To address these issues, we propose DDColor, a new end-to-end method with dual decoders for image colorization. Our approach includes a multi-scale image decoder and a transformer-based color decoder. The former restores the spatial resolution of the image, while the latter establishes the correlation between color and semantic representations via cross-attention. Rather than using additional priors, our two decoders work together to leverage multi-scale image features to guide the optimization of adaptive color queries, significantly alleviating color bleeding. In addition, a simple yet effective colorfulness loss is introduced to further enhance the color richness of the generated results. Our extensive experiments demonstrate that DDColor outperforms existing state-of-the-art methods both quantitatively and qualitatively. Code will be made publicly available at https://github.com/piddnad/DDColor.
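
The abstract describes the color decoder only at a high level. As a rough illustration of how a set of color queries could attend to image features at several scales, the following is a minimal pure-Python sketch of single-head scaled dot-product cross-attention. The function names (`cross_attention`, `refine_over_scales`), the residual update, the omission of learned projection matrices, and the toy feature pyramid are all assumptions made for illustration; this is not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, features):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries:  K color queries, each a list of d floats.
    features: N image feature vectors (flattened pixels), each of d floats.
    Keys and values are both the raw features; learned projections are
    omitted to keep the sketch short (an assumption, not the paper's design).
    Returns (updated_queries, attention_weights).
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    updated, weights = [], []
    for q in queries:
        # Similarity of this query to every feature vector.
        scores = [scale * sum(qi * fi for qi, fi in zip(q, f)) for f in features]
        w = softmax(scores)
        # Weighted sum of the feature vectors, one output per query.
        out = [sum(wj * f[i] for wj, f in zip(w, features)) for i in range(d)]
        updated.append(out)
        weights.append(w)
    return updated, weights


def refine_over_scales(queries, feature_pyramid):
    """Update the queries against each scale in turn, with a residual add."""
    for features in feature_pyramid:
        attended, _ = cross_attention(queries, features)
        queries = [[qi + ai for qi, ai in zip(q, a)]
                   for q, a in zip(queries, attended)]
    return queries


if __name__ == "__main__":
    random.seed(0)
    d, num_queries = 4, 3
    queries = [[random.gauss(0, 1) for _ in range(d)] for _ in range(num_queries)]
    # Hypothetical 3-level pyramid: 2x2, 4x4 and 8x8 feature maps, flattened.
    pyramid = [[[random.gauss(0, 1) for _ in range(d)] for _ in range(n * n)]
               for n in (2, 4, 8)]
    refined = refine_over_scales(queries, pyramid)
    print(len(refined), len(refined[0]))
```

The point of the sketch is only that the same small set of queries is revisited at every scale, so each query can accumulate both coarse semantic context and fine spatial detail. A practical model would add learned projections, multiple heads, normalization, and feed-forward layers.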
