In this paper, we present a novel approach to synthesizing realistic images from semantic layouts. We hypothesize that objects with similar appearance share similar representations. Our method establishes dependencies between regions according to their appearance correlation, yielding representations that are both spatially variant and associated. Conditioned on these features, we propose a dynamically weighted network built from spatially conditional computation (both convolution and normalization). Beyond preserving semantic distinctions, this dynamic network strengthens semantic relevance, which benefits both global structure and detail synthesis. Extensive experiments on benchmarks demonstrate that our method achieves compelling generation performance, both qualitatively and quantitatively.
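To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes that appearance correlation between regions can be approximated by cosine similarity of per-region descriptors, and that spatially conditional normalization means a per-pixel scale and shift applied after channel-wise normalization. The function names `region_affinity`, `associated_features`, and `spatially_conditional_norm`, together with all shapes, are hypothetical and chosen only for illustration; the spatially conditional convolution is omitted.

```python
import numpy as np


def region_affinity(features):
    """Cosine similarity between region descriptors (assumed proxy for
    appearance correlation). features: (R, D) -> (R, R)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def associated_features(features):
    """Mix each region's descriptor with those of similar regions, using a
    row-wise softmax over the affinity matrix. features: (R, D) -> (R, D)."""
    affinity = region_affinity(features)
    shifted = affinity - affinity.max(axis=1, keepdims=True)  # stable softmax
    weights = np.exp(shifted)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ features


def spatially_conditional_norm(x, gamma, beta, eps=1e-5):
    """Normalize each channel of x, then modulate it with a per-pixel scale
    and shift. x, gamma, beta: (C, H, W). In a full model, gamma and beta
    would be predicted from the layout-conditioned features (assumption)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```

Under these assumptions, regions whose descriptors point in similar directions receive similar mixed descriptors, so modulation parameters derived from them vary across space while remaining correlated for similar-looking regions.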