Image inpainting is the task of filling masked or unknown regions of an image with visually realistic content, and it has recently been remarkably improved by Deep Neural Networks (DNNs). As an inverse problem, inpainting faces the underlying challenge of reconstructing semantically coherent results without texture artifacts. Many previous efforts have exploited attention mechanisms and prior knowledge, such as edges and semantic segmentation, yet these methods remain limited in practice by an avalanche of learnable prior parameters and a prohibitive computational burden. To this end, we propose a novel model, Wavelet prior attention learning in Axial Inpainting Network (WAIN), whose generator contains an encoder, a decoder, and two key components: a Wavelet image Prior Attention (WPA) module and stacked multi-layer Axial-Transformers (ATs). In particular, the WPA guides high-level feature aggregation in the multi-scale frequency domain, alleviating texture artifacts. The stacked ATs employ unmasked clues to reason about plausible features along the horizontal and vertical axes together with low-level features, improving semantic coherence. Extensive quantitative and qualitative experiments on the CelebA-HQ and Places2 datasets validate that WAIN achieves state-of-the-art performance over competing methods. The code and models will be released.
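To make the axial modeling concrete, below is a minimal sketch of attention applied separately along the height and width axes of a feature map, in the spirit of the stacked Axial-Transformers described above. This is an illustrative PyTorch example, not the authors' implementation; the class name, head count, and channel width are assumptions.

```python
# Minimal sketch of axial attention over a feature map (illustrative only;
# not the WAIN authors' code).
import torch
import torch.nn as nn

class AxialAttention(nn.Module):
    """Multi-head self-attention applied along one spatial axis of a (B, C, H, W) map."""
    def __init__(self, dim, heads=4, axis='height'):
        super().__init__()
        self.axis = axis
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W)
        b, c, h, w = x.shape
        if self.axis == 'height':
            # Attend along H: fold the width dimension into the batch.
            seq = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        else:
            # Attend along W: fold the height dimension into the batch.
            seq = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        out, _ = self.attn(seq, seq, seq)
        if self.axis == 'height':
            out = out.reshape(b, w, h, c).permute(0, 3, 2, 1)
        else:
            out = out.reshape(b, h, w, c).permute(0, 3, 1, 2)
        return out

# Stacking a height-axis block and a width-axis block approximates full 2D
# attention at a much lower cost, which is the core idea of axial transformers.
x = torch.randn(1, 64, 32, 32)
block = nn.Sequential(AxialAttention(64, axis='height'),
                      AxialAttention(64, axis='width'))
print(block(x).shape)  # torch.Size([1, 64, 32, 32])
```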