Recent advances in image inpainting have produced impressive results in generating plausible visual details against relatively simple backgrounds. For complex scenes, however, restoring reasonable content remains challenging because the contextual information within the missing regions tends to be ambiguous. To tackle this problem, we introduce pretext tasks that are semantically meaningful for estimating the missing content. In particular, we perform knowledge distillation on pretext models and adapt the features to image inpainting. The learned semantic priors should be partially invariant between the high-level pretext task and low-level image inpainting; they not only help in understanding the global context but also provide structural guidance for restoring local textures. Based on the semantic priors, we further propose a context-aware image inpainting model that adaptively integrates global semantics and local features in a unified image generator. The semantic learner and the image generator are trained end to end. We name the model SPL to highlight its ability to learn and leverage semantic priors. It achieves state-of-the-art results on the Places2, CelebA, and Paris StreetView datasets.
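The abstract combines three ideas: distilling features from a pretext model into a semantic learner, adaptively fusing global semantics with local features, and training both parts jointly. The following is a minimal sketch of how those pieces could fit together, not the SPL implementation. The L1 distillation distance, the sigmoid gating rule, the `distill_weight` value, and all function names are assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def l1_distillation_loss(student_prior, teacher_feat):
    """Mean absolute error between the learned priors and the pretext (teacher) features.

    Assumption: an L1 distance is used; the abstract does not name the distance.
    """
    if len(student_prior) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    total = sum(abs(s - t) for s, t in zip(student_prior, teacher_feat))
    return total / len(student_prior)


def sigmoid(x):
    """Logistic function mapping a gate logit to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(local_feat, semantic_prior, gate_logits):
    """Blend local features and semantic priors element by element.

    Hypothetical fusion rule: a gate near 1 favors the semantic prior,
    a gate near 0 favors the local feature.
    """
    if not (len(local_feat) == len(semantic_prior) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for loc, sem, logit in zip(local_feat, semantic_prior, gate_logits):
        gate = sigmoid(logit)
        fused.append(gate * sem + (1.0 - gate) * loc)
    return fused


def joint_loss(reconstruction_loss, distillation_loss, distill_weight=0.1):
    """Single objective for end-to-end training of learner and generator.

    distill_weight is an illustrative value, not one taken from the paper.
    """
    return reconstruction_loss + distill_weight * distillation_loss
```

In a real model the gate logits would be predicted by a small network from the features themselves, and the lists would be replaced by feature maps; plain Python lists are used here only to keep the sketch dependency-free.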