Conditional generative modeling typically requires capturing one-to-many mappings between inputs and outputs. However, vanilla conditional GANs (cGANs) tend to ignore variations in the latent seed, which results in mode collapse. As a solution, recent works have moved towards comparatively expensive models for generating diverse outputs in a conditional setting. In this paper, we argue that the limited diversity of vanilla cGANs is not due to a lack of capacity, but is a result of non-optimal training schemes. We tackle this problem from a geometrical perspective and propose a novel training mechanism that increases both the diversity and the visual quality of the outputs of a vanilla cGAN. The proposed solution does not demand architectural modifications and paves the way for more efficient architectures that target conditional generation in multi-modal spaces. We validate the efficacy of our approach on a diverse set of tasks and show that the proposed solution is generic and effective across multiple datasets.