Point-interactive image colorization aims to colorize grayscale images when a user provides colors for specific locations. To obtain a reasonably colorized image with minimal user effort, point-interactive colorization methods must appropriately propagate the user-provided colors (i.e., user hints) throughout the image. However, existing approaches often produce partially colorized results because stacking convolutional layers is an inefficient way to propagate hints to distant relevant regions. To address this problem, we present iColoriT, a novel point-interactive colorization Vision Transformer that propagates user hints to relevant regions by leveraging the global receptive field of Transformers. The self-attention mechanism of Transformers enables iColoriT to selectively colorize relevant regions with only a few local hints. Our approach colorizes images in real time by utilizing pixel shuffling, an efficient upsampling technique that replaces the decoder architecture. In addition, to mitigate the artifacts caused by pixel shuffling with large upsampling ratios, we present the local stabilizing layer. Extensive quantitative and qualitative results demonstrate that our approach substantially outperforms existing methods for point-interactive colorization, producing accurately colorized images with minimal user effort.
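As a rough illustration of the pixel shuffling step mentioned above, the following is a minimal NumPy sketch of the generic pixel shuffle (depth-to-space) operation. It is not the iColoriT implementation: the function names `pixel_shuffle` and `tokens_to_image`, the 14 x 14 token grid, the upsampling ratio of 16, and the two output color channels are all assumptions chosen for illustration only.

```python
import numpy as np


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange an array of shape (C*r*r, H, W) into (C, H*r, W*r).

    Output pixel (c, h*r + i, w*r + j) is taken from input channel
    c*r*r + i*r + j at spatial position (h, w).
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)


def tokens_to_image(tokens: np.ndarray, grid_h: int, grid_w: int, r: int) -> np.ndarray:
    """Hypothetical helper: map per-patch outputs of shape (N, C*r*r),
    with N = grid_h * grid_w, to a full-resolution (C, grid_h*r, grid_w*r) array.
    """
    n, d = tokens.shape
    if n != grid_h * grid_w:
        raise ValueError("token count must equal grid_h * grid_w")
    feature_map = tokens.T.reshape(d, grid_h, grid_w)
    return pixel_shuffle(feature_map, r)


if __name__ == "__main__":
    # Assumed sizes: 14 x 14 tokens, ratio 16, 2 predicted color channels.
    tokens = np.zeros((14 * 14, 2 * 16 * 16), dtype=np.float32)
    colors = tokens_to_image(tokens, grid_h=14, grid_w=14, r=16)
    print(colors.shape)
```

Because the rearrangement is only a reshape and a transpose, it adds no learned parameters, which is one reason it can stand in for a heavier decoder. With a large ratio, each token alone determines an entire block of output pixels, which is consistent with the block-boundary artifacts the abstract says the local stabilizing layer is meant to mitigate.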