Foreground-aware image synthesis aims to generate images together with their foreground masks. A common approach is to formulate an image as a masked blending of a foreground image and a background image. The problem is challenging because training is prone to a trivial solution in which one image overwhelms the other: the masks become completely full or completely empty, and the foreground and background are not meaningfully separated. We present FurryGAN with three key components: 1) requiring both the foreground image and the composite image to be realistic, 2) designing the mask as a combination of coarse and fine masks, and 3) guiding the generator with an auxiliary mask predictor in the discriminator. Our method produces realistic images with remarkably detailed alpha masks that cover hair, fur, and whiskers in a fully unsupervised manner.
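
To make the masked-blending formulation and its trivial solution concrete, the following is a minimal NumPy sketch, not the FurryGAN implementation. It assumes the usual alpha-compositing form, composite = mask * foreground + (1 - mask) * background. The function names `composite`, `mask_coverage`, and `is_degenerate`, as well as the threshold `eps`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def composite(foreground, background, mask):
    """Blend two images with a soft mask (assumed alpha-compositing form).

    foreground, background: float arrays of shape (H, W, C).
    mask: float array of shape (H, W) or (H, W, 1), values in [0, 1].
    """
    mask = np.clip(mask, 0.0, 1.0)
    if mask.ndim == foreground.ndim - 1:
        mask = mask[..., None]  # add a channel axis so the mask broadcasts
    return mask * foreground + (1.0 - mask) * background


def mask_coverage(mask):
    """Fraction of the image area assigned to the foreground."""
    return float(np.mean(mask))


def is_degenerate(mask, eps=0.05):
    """Flag the trivial solution: a mask that is almost all 0 or almost all 1.

    eps is a hypothetical threshold, not a value taken from the paper.
    """
    coverage = mask_coverage(mask)
    return coverage < eps or coverage > 1.0 - eps


if __name__ == "__main__":
    fg = np.ones((4, 4, 3))    # all-white stand-in for a foreground image
    bg = np.zeros((4, 4, 3))   # all-black stand-in for a background image

    half = np.zeros((4, 4))
    half[:, :2] = 1.0          # left half is foreground
    full = np.ones((4, 4))     # trivial solution: the foreground takes over

    print(composite(fg, bg, half)[0, :, 0])
    print(is_degenerate(half), is_degenerate(full))
```

With a full mask the composite is identical to the foreground image, so the background branch has no influence on the output. This is the collapse the abstract describes, and the three components listed above are aimed at avoiding it.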