Generative adversarial networks (GANs) are among the most significant recent advances in AI. Because they can learn the probability distribution of data directly and then sample realistic synthetic data, many applications have emerged that use GANs to address classical machine learning problems such as data augmentation, class imbalance, and fair representation learning. In this paper, we analyze and highlight fairness concerns of GAN models. We show empirically that GAN models may inherently prefer certain groups during training and are therefore unable to generate data from different groups homogeneously at test time. Furthermore, we propose solutions to this issue: either conditioning the GAN model on each sample's group, or using an ensemble method (boosting), so that the GAN model can leverage the distributed structure of the data during training and generate groups at an equal rate at test time.
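To make the notion of generating groups "at equal rate" concrete, the following is a minimal sketch of how group representation in generated samples could be measured, and of how balanced group labels could be drawn for a group-conditioned generator. It is an illustration only, not the authors' implementation: the function names (`group_shares`, `max_share_gap`, `balanced_group_labels`), the two groups `"a"` and `"b"`, and the 90/10 split are all hypothetical, and no actual GAN is trained here.

```python
import random
from collections import Counter


def group_shares(group_labels, groups):
    """Return the fraction of samples that fall in each group."""
    counts = Counter(group_labels)
    total = len(group_labels)
    return {g: counts.get(g, 0) / total for g in groups}


def max_share_gap(shares):
    """Largest difference between any two group shares (0.0 means equal rates)."""
    values = list(shares.values())
    return max(values) - min(values)


def balanced_group_labels(groups, n_samples, seed=0):
    """Build a shuffled list of group labels in which each group appears
    (almost) equally often.

    Assumption: these labels would be fed as the condition to a
    group-conditioned generator, so that every group is requested
    at the same rate when sampling.
    """
    rng = random.Random(seed)
    labels = [groups[i % len(groups)] for i in range(n_samples)]
    rng.shuffle(labels)
    return labels


# Hypothetical group labels assigned to samples from an unconditional
# generator that favors group "a" (made-up numbers, for illustration only).
skewed_labels = ["a"] * 90 + ["b"] * 10
skewed_gap = max_share_gap(group_shares(skewed_labels, ["a", "b"]))

# Labels requested from a group-conditioned generator.
balanced_labels = balanced_group_labels(["a", "b"], 100)
balanced_gap = max_share_gap(group_shares(balanced_labels, ["a", "b"]))
```

In this sketch, a large gap for the unconditional case would indicate that one group dominates the generated data, while drawing the condition labels in a balanced way keeps the requested rates equal by construction. Whether the generator then produces faithful samples for each requested group is a separate question that the sketch does not address.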