Deep Generative Networks (DGNs) are extensively employed in Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and their variants to approximate the data manifold and the data distribution on that manifold. However, training samples are often obtained based on preferences, costs, or convenience, producing artifacts in the empirical data distribution, e.g., the large fraction of smiling faces in the CelebA dataset or the large fraction of dark-haired individuals in FFHQ. These inconsistencies will be reproduced when sampling from the trained DGN, which has far-reaching potential implications for fairness, data augmentation, anomaly detection, domain adaptation, and beyond. In response, we develop a differential geometry based sampler -- coined MaGNET -- that, given any trained DGN, produces samples that are uniformly distributed on the learned manifold. We prove theoretically and demonstrate empirically that our technique produces a uniform distribution on the manifold regardless of the training set distribution. We perform a range of experiments on various datasets and DGNs. One of them considers the state-of-the-art StyleGAN2 trained on the FFHQ dataset, where uniform sampling via MaGNET increases distribution precision and recall by 4.1% and 3.0%, respectively, and decreases gender bias by 41.2%, without requiring labels or retraining.
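To make the idea of uniform sampling on a learned manifold concrete, below is a minimal sketch (not the authors' released implementation) of one way such a sampler can be realized: latent proposals drawn from the prior are importance-resampled with weights proportional to the generator's local volume element sqrt(det(J^T J)) divided by the prior density, so the pushed-forward samples are approximately uniform on the generated manifold. The toy generator `G`, the 2-D latent space, and the function names are hypothetical placeholders standing in for a trained DGN.

```python
import jax
import jax.numpy as jnp

def G(z):
    """Hypothetical toy generator: maps 2-D latents onto a curved surface in R^3."""
    return jnp.array([z[0], z[1], jnp.sin(z[0]) * jnp.cos(z[1])])

def log_weight(z):
    """Log importance weight for uniform-on-manifold resampling.

    The target latent density is proportional to the volume element
    sqrt(det(J^T J)); proposals are drawn from the standard-normal prior,
    so the (unnormalized) weight is volume_element(z) / prior(z).
    """
    J = jax.jacfwd(G)(z)                              # (3, 2) Jacobian of the generator at z
    log_vol = 0.5 * jnp.linalg.slogdet(J.T @ J)[1]    # log sqrt(det(J^T J))
    log_prior = jax.scipy.stats.norm.logpdf(z).sum()  # log N(z; 0, I)
    return log_vol - log_prior

def uniform_manifold_sample(key, n_samples, n_proposals=10_000):
    """Self-normalized importance resampling: prior proposals are re-drawn with
    probability proportional to exp(log_weight), so the pushed-forward samples
    are approximately uniform on the generated manifold."""
    k1, k2 = jax.random.split(key)
    z = jax.random.normal(k1, (n_proposals, 2))       # proposals from the latent prior
    p = jax.nn.softmax(jax.vmap(log_weight)(z))       # normalized resampling probabilities
    idx = jax.random.choice(k2, n_proposals, (n_samples,), p=p)
    return jax.vmap(G)(z[idx])

samples = uniform_manifold_sample(jax.random.PRNGKey(0), n_samples=256)
print(samples.shape)  # (256, 3)
```

The resampling step trades sample diversity for correctness of the target measure (duplicates can appear when a few proposals carry most of the weight); in practice the number of proposals would be chosen much larger than the number of retained samples.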