Colorizing a given gray-level image is an important task in the media and advertising industry. Due to the ambiguity inherent to colorization (many shades are often plausible), recent approaches have started to model diversity explicitly. However, one of the most obvious artifacts, structural inconsistency, is rarely considered by existing methods, which predict chrominance independently for every pixel. To address this issue, we develop a conditional random field based variational auto-encoder formulation that achieves diversity while taking structural consistency into account. Moreover, we introduce a controllability mechanism that can incorporate external constraints from diverse sources, including a user interface. We demonstrate that, compared to existing baselines, our method obtains more diverse and globally consistent colorizations on the LFW, LSUN-Church and ILSVRC-2015 datasets.