The purpose of image inpainting is to recover scratches and damaged areas using contextual information from the remaining parts of an image. In recent years, thanks to the resurgence of convolutional neural networks (CNNs), the image inpainting task has made great breakthroughs. However, most existing works consider an insufficient variety of mask types, and their performance drops dramatically when encountering unseen masks. To combat these challenges, we propose a simple yet general method built on the LaMa image inpainting framework, dubbed GLaMa. Our proposed GLaMa better captures different types of missing information by training with a wider variety of masks; by incorporating more degraded images in the training phase, we expect to enhance the robustness of the model to various masks. To yield more plausible results, we further introduce a frequency-based loss in addition to the traditional spatial reconstruction loss and adversarial loss. In particular, combining reconstruction losses in both the spatial and frequency domains reduces the checkerboard artifacts and ripples in the reconstructed image. Extensive experiments demonstrate that our method outperforms the original LaMa for every mask type on the FFHQ, ImageNet, Places2 and WikiArt datasets. The proposed GLaMa ranked first in terms of PSNR, LPIPS and SSIM in the NTIRE 2022 Image Inpainting Challenge Track 1 (Unsupervised).
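The combined spatial- and frequency-domain reconstruction loss mentioned above can be sketched roughly as follows. This is a minimal illustrative example, not the paper's exact formulation: the L1 comparison of FFT coefficients, the function names, and the weighting factor `lam` are all assumptions made here for clarity.

```python
import numpy as np

def frequency_l1_loss(pred, target):
    """L1 distance between the 2-D Fourier transforms of two images.

    Penalizing errors in the frequency domain discourages periodic
    artifacts (checkerboard patterns, ripples) that a purely spatial
    loss can overlook. The use of plain L1 on complex FFT coefficients
    here is an illustrative choice, not GLaMa's exact definition.
    """
    pred_f = np.fft.fft2(pred)
    target_f = np.fft.fft2(target)
    return np.abs(pred_f - target_f).mean()

def combined_reconstruction_loss(pred, target, lam=0.1):
    """Spatial L1 plus a weighted frequency-domain term.

    `lam` is a hypothetical balancing weight; the actual value would
    be tuned on the training data.
    """
    spatial = np.abs(pred - target).mean()  # standard per-pixel L1
    return spatial + lam * frequency_l1_loss(pred, target)
```

A perfect reconstruction yields zero loss for both terms, while any residual periodic artifact contributes through the frequency component even when the average per-pixel error is small.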