Class-conditional generative models are crucial tools for generating data from user-specified class labels. Existing approaches to class-conditional generation require nontrivial modifications of backbone generative architectures in order to model the conditional information fed into the model. This paper introduces a plug-and-play module named `multimodal controller' that generates multimodal data without introducing any additional learnable parameters. In the absence of the controllers, our models reduce to their unconditional counterparts. We test the efficacy of multimodal controllers on the CIFAR10, COIL100, and Omniglot benchmark datasets. We demonstrate that multimodal controlled generative models (including VAE, PixelCNN, Glow, and GAN) can generate class-conditional images of significantly better quality than state-of-the-art conditional generative models. Moreover, we show that multimodal controlled models can also create novel modalities of images.
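To make the parameter-free conditioning idea concrete, below is a minimal PyTorch sketch of one plausible realization: a controller that gates a layer's channels with a fixed, randomly sampled binary codeword per class, so no learnable parameters are added. The class name `MultimodalController`, the codeword-sampling scheme, and the `seed` argument are illustrative assumptions, not necessarily the paper's exact construction.

```python
import torch
import torch.nn as nn


class MultimodalController(nn.Module):
    """Parameter-free class conditioning (illustrative sketch).

    Each class is assigned a fixed random binary codeword over the
    channel dimension; the forward pass multiplies features by the
    codeword of the requested class. Removing the module recovers the
    unconditional model, and no learnable parameters are introduced.
    """

    def __init__(self, num_channels: int, num_classes: int, seed: int = 0):
        super().__init__()
        g = torch.Generator().manual_seed(seed)
        # One binary codeword per class, sampled once and then frozen.
        codebook = (torch.rand(num_classes, num_channels, generator=g) > 0.5).float()
        # A buffer (not a Parameter): saved/moved with the model, never optimized.
        self.register_buffer("codebook", codebook)

    def forward(self, x: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) feature maps; labels: (N,) integer class indices.
        mask = self.codebook[labels]            # (N, C)
        return x * mask[:, :, None, None]       # zero out channels absent from the codeword


if __name__ == "__main__":
    # Usage: gate a 64-channel feature map by class label.
    ctrl = MultimodalController(num_channels=64, num_classes=10)
    x = torch.randn(8, 64, 16, 16)
    y = torch.randint(0, 10, (8,))
    out = ctrl(x, y)
    print(out.shape)  # torch.Size([8, 64, 16, 16])
```

Storing the codebook as a buffer rather than a parameter keeps it out of the optimizer while still letting it move with `.to(device)` and persist in the `state_dict`, which is one simple way to satisfy the "no additional learnable parameters" property the abstract claims.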