Generative image models have been studied extensively in recent years. In the unconditional setting, they model the marginal distribution of unlabelled images. To allow for more control, image synthesis can be conditioned on semantic segmentation maps that tell the generator where objects should be placed in the image. Although these two tasks are intimately related, they are generally studied in isolation. We propose OCO-GAN, for Optionally COnditioned GAN, which addresses both tasks in a unified manner, with a shared image synthesis network that can be conditioned either on semantic maps or directly on latents. Trained end-to-end in an adversarial manner with a shared discriminator, we leverage the synergy between both tasks. We experiment on the Cityscapes, COCO-Stuff, and ADE20K datasets in limited-data, semi-supervised, and full-data regimes and obtain excellent performance, improving in all settings over existing hybrid models that can generate both with and without conditioning. Moreover, our results are competitive with, or better than, state-of-the-art specialised unconditional and conditional models.
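The central idea, a single synthesis network that accepts an optional conditioning signal, can be illustrated with a deliberately minimal sketch. Everything below (layer shapes, the zero-vector stand-in for the unconditional path, the function names) is an illustrative assumption, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8   # size of the latent code z (toy value)
COND_DIM = 4     # hypothetical embedding size for a semantic map
IMG_DIM = 16     # flattened "image" size for this toy example

# Shared synthesis weights, used by both the conditional and
# unconditional generation paths.
W_shared = rng.normal(size=(LATENT_DIM + COND_DIM, IMG_DIM))

def generate(z, seg_embedding=None):
    """Synthesize a toy 'image' from latent z, optionally conditioned
    on a semantic-map embedding. When no conditioning is supplied, a
    zero vector stands in, so one shared network serves both tasks."""
    cond = seg_embedding if seg_embedding is not None else np.zeros(COND_DIM)
    h = np.concatenate([z, cond])
    return np.tanh(h @ W_shared)

z = rng.normal(size=LATENT_DIM)
unconditional = generate(z)                                     # no semantic map
conditional = generate(z, seg_embedding=rng.normal(size=COND_DIM))  # with map
```

The same weights `W_shared` produce both outputs; only the presence of the conditioning input differs, which is the property that lets a shared discriminator train both paths jointly.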