Multi-attribute conditional image generation is a challenging problem in computer vision. We propose the Multi-attribute Pizza Generator (MPG), a conditional Generative Adversarial Network (GAN) framework for synthesizing images from a trichotomy of attributes: content, view-geometry, and implicit visual style. We design MPG by extending the state-of-the-art StyleGAN2 with a new conditioning technique that guides the intermediate feature maps to learn multi-scale, multi-attribute entangled representations of the controlling attributes. Because multi-attribute image generation is a complex problem, we regularize the generation process by predicting the explicit conditioning attributes (ingredients and view) from the generated images. To synthesize pizza images with view attributes outside the range of natural training images, we design a CGI pizza dataset, PizzaView, using 3D pizza models, and use it to train a view attribute regressor that regularizes the generation process and bridges the real and CGI training datasets. To verify the efficacy of MPG, we test it on Pizza10, a carefully annotated multi-ingredient pizza image dataset. MPG successfully generates photo-realistic pizza images with the desired ingredients and view attributes, including views beyond the range observed in real-world training data.
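The attribute regularization described above can be pictured as extra terms added to the generator's adversarial loss. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the loss forms or weights, so the binary cross-entropy term for ingredients, the mean-squared-error term for view attributes, the weights `lambda_ingr` and `lambda_view`, and all function names are hypothetical.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


def mean_squared_error(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(pred)


def regularized_generator_loss(adv_loss, ingr_probs, ingr_targets,
                               view_pred, view_target,
                               lambda_ingr=1.0, lambda_view=1.0):
    """Adversarial loss plus attribute-prediction penalties.

    Assumption: ingredients are treated as a multi-label target (BCE) and
    view attributes as a real-valued vector (MSE). Both choices and both
    weights are illustrative, not taken from the paper.
    """
    ingr_term = binary_cross_entropy(ingr_probs, ingr_targets)
    view_term = mean_squared_error(view_pred, view_target)
    return adv_loss + lambda_ingr * ingr_term + lambda_view * view_term


# Hypothetical values: three ingredient probabilities predicted from a
# generated image, and a two-component view vector from a view regressor.
loss = regularized_generator_loss(
    adv_loss=0.8,
    ingr_probs=[0.9, 0.2, 0.7],
    ingr_targets=[1, 0, 1],
    view_pred=[0.1, 0.5],
    view_target=[0.0, 0.5],
)
print(loss)
```

In this sketch the penalties grow when the attributes predicted from a generated image disagree with the attributes it was conditioned on, which is the sense in which attribute prediction regularizes generation.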