Image generation has attracted tremendous attention in both academia and industry, especially conditional, target-oriented generation such as criminal portrait sketching and fashion design. Although current studies have achieved preliminary results in this direction, they typically use only class labels as the condition, so spatial contents are generated at random from latent vectors. Edge details are usually blurred because spatial information is difficult to preserve. In light of this, we propose a novel Spatially Constrained Generative Adversarial Network (SCGAN), which decouples spatial constraints from the latent vector and makes them available as additional controllable signals. To enhance spatial controllability, a generator network is designed to take a semantic segmentation map, a latent vector, and an attribute-level label as inputs step by step. In addition, a segmentor network is constructed to impose spatial constraints on the generator. Experimentally, we provide both visual and quantitative results on the CelebA and DeepFashion datasets, demonstrating that the proposed SCGAN is effective in controlling spatial contents and generating high-quality images.
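The conditioning scheme described above (a generator consuming a semantic segmentation map, a latent vector, and an attribute-level label) can be sketched at the tensor level. The shapes and the simple broadcast-and-concatenate assembly below are illustrative assumptions for exposition, not the paper's exact architecture:

```python
import numpy as np

def generator_inputs(seg_map, latent, attr_label):
    """Assemble the conditioning tensor for a spatially constrained
    generator: the segmentation map fixes the spatial layout, while the
    latent vector and attribute label are broadcast over all spatial
    positions. Shapes are illustrative, not from the paper.
    seg_map:    (H, W, C_seg) one-hot semantic segmentation
    latent:     (C_z,)        random latent vector
    attr_label: (C_a,)        attribute-level label (e.g. one-hot)
    returns:    (H, W, C_seg + C_z + C_a) stacked conditioning input
    """
    h, w, _ = seg_map.shape
    z = np.broadcast_to(latent, (h, w, latent.shape[0]))
    a = np.broadcast_to(attr_label, (h, w, attr_label.shape[0]))
    return np.concatenate([seg_map, z, a], axis=-1)

# Toy example: a 4x4 map with 3 semantic classes, an 8-dim latent
# vector, and a 2-dim attribute label.
seg = np.eye(3)[np.random.randint(0, 3, size=(4, 4))]  # (4, 4, 3)
x = generator_inputs(seg, np.random.randn(8), np.array([1.0, 0.0]))
print(x.shape)  # (4, 4, 13)
```

Because the segmentation channels remain spatially aligned with the output, every pixel of the generator sees which semantic class it belongs to, which is what lets spatial contents be controlled rather than generated at random from the latent vector alone.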