Point-interactive image colorization aims to colorize grayscale images when a user provides colors for specific locations. Point-interactive colorization methods must appropriately propagate user-provided colors (i.e., user hints) throughout the entire image to obtain a reasonably colorized image with minimal user effort. However, existing approaches often produce partially colorized results, because stacking convolutional layers is an inefficient way to propagate hints to distant relevant regions. To address this problem, we present iColoriT, a novel point-interactive colorization Vision Transformer that propagates user hints to relevant regions by leveraging the global receptive field of Transformers. The self-attention mechanism of Transformers enables iColoriT to selectively colorize relevant regions with only a few local hints. Our approach colorizes images in real time by utilizing pixel shuffling, an efficient upsampling technique that replaces the decoder architecture. In addition, to mitigate the artifacts caused by pixel shuffling with large upsampling ratios, we present the local stabilizing layer. Extensive quantitative and qualitative results demonstrate that our approach substantially outperforms existing methods for point-interactive colorization, producing accurately colorized images with minimal user effort. The official code is available at https://pmh9960.github.io/research/iColoriT
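To make the pixel-shuffling step concrete, the following is a minimal NumPy sketch of the generic depth-to-space rearrangement that the term usually refers to. It is not taken from the official iColoriT code: the function name `pixel_shuffle`, the channel-first layout, and the toy tensor sizes are assumptions chosen for illustration only.

```python
import numpy as np


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C * r * r, H, W) array into (C, H * r, W * r).

    Hypothetical helper for illustration; it follows the common
    depth-to-space convention:
        out[c, h * r + i, w * r + j] = x[c * r * r + i * r + j, h, w]
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)      # axes: (c, i, j, h, w)
    x = x.transpose(0, 3, 1, 4, 2)    # axes: (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)


# Toy sizes (assumed, not the paper's configuration): a 3x3 grid of
# features, each holding a 4x4 block of 2 color channels.
features = np.random.rand(2 * 4 * 4, 3, 3)
colors = pixel_shuffle(features, r=4)
print(colors.shape)  # (2, 12, 12)
```

Because the rearrangement is a pure reshape and transpose with no learned parameters, upsampling this way adds almost no compute, which is consistent with the abstract's claim that it can stand in for a heavier decoder. Each low-resolution feature fills its own `r` by `r` output block independently, which is one plausible reading of why large ratios can produce block-boundary artifacts.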