Grayscale image colorization is a fascinating application of AI for information restoration. The problem is inherently ill-posed, which makes it especially challenging: a single grayscale input admits several plausible colorizations, so the output is multi-modal. The learning-based methods currently in use produce acceptable results for straightforward cases but usually fail to restore contextual information in the absence of clear figure-ground separation. The colorized images also suffer from color bleeding and desaturated backgrounds, since a single model trained on full-image features is insufficient for learning the diverse data modes. To address these issues, we present a parallel GAN-based colorization framework. In our approach, two separately tailored GAN pipelines colorize the foreground (using object-level features) and the background (using full-image features). The foreground pipeline employs a Residual-UNet with self-attention as its generator, trained using the full-image features and the corresponding object-level features from the COCO dataset. The background pipeline relies on full-image features and additional training examples from the Places dataset. We design a DenseFuse-based fusion network to obtain the final colorized image by feature-based fusion of the outputs generated in parallel. We show the shortcomings of the non-perceptual evaluation metrics commonly used to assess multi-modal problems like image colorization and perform an extensive performance evaluation of our framework using multiple perceptual metrics. Our approach outperforms most of the existing learning-based methods and produces results comparable to the state-of-the-art. Further, we perform a runtime analysis and obtain an average inference time of 24 ms per image.
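The shortcoming of non-perceptual metrics on a multi-modal problem can be illustrated with a toy example. The sketch below is not the evaluation carried out in this work; the pixel values, the helper names `mse` and `psnr`, and the assumed peak range of 255 for the chroma channels are all hypothetical. It compares two candidate colorizations of the same object in the (a, b) chroma plane, with luminance taken as given: a vivid but different hue, and a fully desaturated output.

```python
import math

def mse(pred, target):
    """Mean squared error over the (a, b) chroma values of two pixel lists."""
    errs = [(p - t) ** 2 for pp, tp in zip(pred, target) for p, t in zip(pp, tp)]
    return sum(errs) / len(errs)

def psnr(pred, target, peak=255.0):
    """PSNR in dB; `peak` is an assumed dynamic range for the chroma channels."""
    err = mse(pred, target)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

# Toy 4-pixel object; all chroma values are illustrative, not taken from any dataset.
ground_truth = [(60, 40)] * 4     # the object happened to be reddish
plausible_alt = [(20, -60)] * 4   # a vivid, equally plausible bluish colorization
desaturated = [(0, 0)] * 4        # zero chroma, i.e. the object is left gray

for name, candidate in [("plausible_alt", plausible_alt), ("desaturated", desaturated)]:
    print(f"{name}: MSE={mse(candidate, ground_truth):.0f}, "
          f"PSNR={psnr(candidate, ground_truth):.2f} dB")
```

Under these assumptions the desaturated output has the lower MSE, and therefore the higher PSNR, even though a viewer would likely judge the vivid alternative to be the more convincing colorization. A pixel-wise metric rewards predictions close to the average of the plausible modes, which is one reason to complement it with perceptual metrics.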