Class-conditional generative models are crucial tools for generating data from user-specified class labels. Existing approaches to class-conditional generation require nontrivial modifications of backbone generative architectures to model the conditional information fed into the model. This paper introduces a plug-and-play module, named the `multimodal controller', that generates multimodal data without introducing any additional learning parameters. In the absence of the controllers, our model reduces to a non-conditional generative model. We test the efficacy of multimodal controllers on the CIFAR10, COIL100, and Omniglot benchmark datasets. We demonstrate that multimodal controlled generative models (including VAE, PixelCNN, Glow, and GAN) can generate class-conditional images of significantly better quality than conventional conditional generative models. Moreover, we show that multimodal controlled models can also create novel modalities of images.
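One way such a parameter-free, plug-and-play conditioning layer could be realized is by assigning each class a fixed binary mask over a layer's feature channels, so that conditioning adds no learnable parameters and dropping the module recovers the unconditional network. This is a minimal sketch under that assumption; the class name, the random-mask scheme, and the seed are all hypothetical illustrations, not the paper's exact construction:

```python
import numpy as np

class MultimodalController:
    """Hypothetical sketch of a parameter-free conditioning module.

    Each class is assigned a fixed (non-learned) binary code over the
    feature channels; conditioning is just element-wise masking, so the
    module introduces no additional learning parameters.
    """

    def __init__(self, num_classes, num_channels, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random binary code per class, frozen at construction time.
        self.codes = (rng.random((num_classes, num_channels)) > 0.5).astype(np.float32)

    def __call__(self, features, labels):
        # features: (batch, channels, H, W); labels: (batch,) of class ids.
        # Broadcast each sample's class code across its spatial dimensions.
        mask = self.codes[labels][:, :, None, None]
        return features * mask
```

Because the masks are fixed, the module can be inserted after any hidden layer of an existing backbone; removing it (or masking with all-ones) leaves the unconditional model unchanged.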