Recent work in image inpainting has shown that structural information plays an important role in recovering visually pleasing results. In this paper, we propose an end-to-end architecture composed of two parallel UNet-based streams: a main stream (MS) and a structure stream (SS). With the assistance of SS, MS can produce plausible results with reasonable structures and realistic details. Specifically, MS reconstructs detailed images by inferring missing structures and textures simultaneously, while SS restores only the missing structures by processing hierarchical information from the encoder of MS. Through its interaction with SS during training, MS is implicitly encouraged to exploit structural cues. To help SS focus on structures and to prevent the textures in MS from being affected, we propose a gated unit that suppresses structure-irrelevant activations in the information flow between MS and SS. Furthermore, the multi-scale structure feature maps in SS are used to explicitly guide structurally reasonable image reconstruction in the decoder of MS through a fusion block. Extensive experiments on the CelebA, Paris StreetView, and Places2 datasets demonstrate that the proposed method outperforms state-of-the-art methods.
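The abstract does not specify the internal form of the gated unit, so the following is only a minimal sketch under stated assumptions: the gate is taken to be a sigmoid over a 1x1 convolution of the incoming feature map, and the gated output is the element-wise product of gate and features. The names `gated_unit`, `gate_weight`, and `gate_bias` are hypothetical and chosen for illustration; the actual unit in the paper may differ.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def gated_unit(features, gate_weight, gate_bias):
    """Soft-gate a feature map (assumed form, not the paper's exact unit).

    features:    array of shape (C, H, W), e.g. one MS encoder level
    gate_weight: array of shape (C, C), weights of a 1x1 convolution
    gate_bias:   array of shape (C,)
    Returns the gated features and the gate itself.
    """
    # A 1x1 convolution is a per-pixel linear map across channels.
    logits = np.einsum("oc,chw->ohw", gate_weight, features)
    logits = logits + gate_bias[:, None, None]
    # Gate values lie in (0, 1); values near 0 suppress an activation.
    gate = sigmoid(logits)
    return gate * features, gate


# Toy usage with random values standing in for learned parameters.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8))
gate_weight = rng.standard_normal((4, 4)) * 0.1
gate_bias = np.zeros(4)
gated, gate = gated_unit(features, gate_weight, gate_bias)
```

In this sketch the gate can only attenuate activations, never amplify them, which matches the stated purpose of suppressing structure-irrelevant information passed between MS and SS. In a trained network the gate parameters would be learned jointly with both streams.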