Object removal and image inpainting in facial images is the task of specifically targeting objects that occlude a face, removing them, and replacing them with a properly reconstructed facial region. Two approaches, one based on the U-net and the other on a modulated generator, have been widely adopted for this task for their distinct advantages, yet each carries innate disadvantages. The U-net, a conventional choice for conditional GANs, retains fine details of the unmasked regions, but the style of the reconstructed region is often inconsistent with the rest of the original image, and the approach is robust only when the occluding object is sufficiently small. In contrast, the modulated generative approach can handle a larger occluded area and produces a more consistent style, yet it usually loses most of the detailed features. This trade-off between the two models calls for a model that can be applied to a mask of any size while maintaining a consistent style and preserving minute facial details. Here, we propose the Semantics-Guided Inpainting Network (SGIN), a modification of the modulated generator that aims to exploit its strong generative capability while preserving the high-fidelity details of the original image. Guided by a semantic map, our model can also manipulate facial features, which gives direction to the one-to-many inpainting problem and improves practicability.
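To make the idea of semantic-map guidance concrete, the sketch below shows a SPADE-style modulation layer in PyTorch, where spatially varying scale and shift parameters predicted from a face-parsing map modulate normalized generator features. This is only an illustrative assumption of how such guidance can be wired, not the actual SGIN architecture; the class name SemanticModulation, the choice of 19 parsing classes, and the layer sizes are hypothetical.

```python
# Minimal sketch (assumption, not the SGIN implementation): a semantic map
# modulates generator features via SPADE-style conditional normalization.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticModulation(nn.Module):
    """Modulate feature maps with scale/shift predicted from a semantic map."""

    def __init__(self, feat_channels: int, num_classes: int, hidden: int = 64):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_channels, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(num_classes, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.to_gamma = nn.Conv2d(hidden, feat_channels, kernel_size=3, padding=1)
        self.to_beta = nn.Conv2d(hidden, feat_channels, kernel_size=3, padding=1)

    def forward(self, feat: torch.Tensor, semantic_map: torch.Tensor) -> torch.Tensor:
        # Resize the one-hot semantic map to the feature resolution.
        seg = F.interpolate(semantic_map, size=feat.shape[2:], mode="nearest")
        h = self.shared(seg)
        gamma, beta = self.to_gamma(h), self.to_beta(h)
        # Spatially varying affine modulation of the normalized features.
        return self.norm(feat) * (1.0 + gamma) + beta


if __name__ == "__main__":
    feats = torch.randn(1, 256, 32, 32)           # generator features
    seg = torch.randint(0, 19, (1, 1, 128, 128))  # hypothetical 19 face-parsing classes
    seg_onehot = F.one_hot(seg.squeeze(1), 19).permute(0, 3, 1, 2).float()
    out = SemanticModulation(256, 19)(feats, seg_onehot)
    print(out.shape)  # torch.Size([1, 256, 32, 32])
```

Because the modulation parameters vary per pixel and per semantic class, editing the semantic map (for example, changing the label of a region) changes how the corresponding features are reconstructed, which is one plausible way a semantic map can steer the one-to-many inpainting problem.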