How can one generate conditional synthetic data for a domain without using any information about its labels/attributes? Our work presents a solution to this question. We propose a transfer-learning framework based on normalizing flows, trained with both maximum-likelihood and adversarial objectives. We model a source domain (labels available) and a target domain (labels unavailable) with separate normalizing flows, and align the two domains in a common latent space using adversarial discriminators. Because flow models are invertible, the mapping between domains has exact cycle consistency. We also learn the joint distribution of data samples and attributes in the source domain by training an encoder, adversarially, to map attributes to the latent space. During the synthesis phase, given any combination of attributes, our method can generate synthetic samples in the target domain conditioned on those attributes. Empirical studies confirm the effectiveness of our method on benchmark datasets. We envision our method being particularly useful for synthetic data generation in label-scarce systems, where it generates non-trivial augmentations via attribute transformations. These synthetic samples introduce more entropy into the label-scarce domain than their geometric and photometric transformation counterparts, which helps make downstream tasks more robust.
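The cycle-consistency claim can be illustrated with a minimal sketch. It assumes a single RealNVP-style affine coupling layer per domain with random, untrained weights; the class `AffineCoupling` and the names `flow_source` and `flow_target` are hypothetical and are not taken from the paper, and no maximum-likelihood or adversarial training is performed. The point is only that mapping source to latent to target and back recovers the input up to floating-point error, because each flow has a closed-form inverse.

```python
import numpy as np

class AffineCoupling:
    """Toy affine coupling layer: an invertible map with a closed-form inverse."""

    def __init__(self, dim, rng):
        self.half = dim // 2
        # Hypothetical tiny linear "networks" for the log-scale and shift terms.
        self.W_s = 0.1 * rng.standard_normal((self.half, dim - self.half))
        self.W_t = 0.1 * rng.standard_normal((self.half, dim - self.half))

    def forward(self, x):
        # Data -> latent. The first half passes through unchanged.
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s = np.tanh(x1 @ self.W_s)
        t = x1 @ self.W_t
        z2 = x2 * np.exp(log_s) + t
        log_det = log_s.sum(axis=1)  # log |det Jacobian|, needed for likelihood
        return np.concatenate([x1, z2], axis=1), log_det

    def inverse(self, z):
        # Latent -> data. Recomputes the same scale and shift from the first half.
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s = np.tanh(z1 @ self.W_s)
        t = z1 @ self.W_t
        x2 = (z2 - t) * np.exp(-log_s)
        return np.concatenate([z1, x2], axis=1)

rng = np.random.default_rng(0)
flow_source = AffineCoupling(4, rng)  # stands in for the source-domain flow
flow_target = AffineCoupling(4, rng)  # stands in for the target-domain flow

x_source = rng.standard_normal((8, 4))
z, _ = flow_source.forward(x_source)       # source -> shared latent
x_target = flow_target.inverse(z)          # latent -> target
z_back, _ = flow_target.forward(x_target)  # target -> latent
x_cycle = flow_source.inverse(z_back)      # latent -> source

cycle_error = np.max(np.abs(x_cycle - x_source))
print("max cycle error:", cycle_error)
```

In this sketch the cycle error comes from floating-point rounding alone, so no separate cycle-consistency loss term is needed; whether the translated samples are meaningful still depends on the latent alignment that the abstract attributes to adversarial discriminators.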