Generative Adversarial Networks (GANs) advance face synthesis by learning the underlying distribution of the observed data. Despite the high quality of the generated faces, some minority groups are rarely produced by the trained models due to a biased image generation process. To study this issue, we first conduct an empirical study on a pre-trained face synthesis model. We observe that, after training, the GAN model not only carries the biases of the training data but also amplifies them to some degree during image generation. To improve the fairness of image generation, we propose an interpretable baseline method that balances the output facial attributes without retraining. The proposed method shifts the interpretable semantic distribution in the latent space toward a more balanced image generation while preserving sample diversity. Besides producing more balanced data with respect to a particular attribute (e.g., race or gender), our method generalizes to handle multiple attributes at a time and to synthesize samples of fine-grained subgroups. We further demonstrate the applicability of the balanced data sampled from GANs for quantifying the biases in other face analysis systems, such as commercial face attribute classifiers and face super-resolution algorithms.
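The latent-space shifting idea can be pictured with a minimal sketch. The names below are illustrative assumptions, not the paper's implementation: `G` stands for a pre-trained generator, `classify` for an attribute classifier scoring generated images, and `n_attr` for a learned semantic direction in latent space (e.g., the normal of a linear boundary separating the attribute). Under the assumption that moving along `n_attr` monotonically changes the attribute ratio, a constant offset that balances the generated distribution can be found by binary search:

```python
import numpy as np

# Hypothetical sketch of latent-space rebalancing. Assumed components
# (not from the paper):
#   G(z)        -> pre-trained GAN generator mapping latents to images
#   classify(x) -> attribute classifier returning P(attribute | image)
#   n_attr      -> unit semantic direction in latent space

def rebalance_latents(z, n_attr, shift):
    """Shift all latent codes by the same offset along n_attr.

    A constant shift moves the attribute distribution (e.g., toward a
    50/50 split) while keeping the relative spread of the samples,
    which is what preserves sample diversity.
    """
    return z + shift * n_attr

def find_balancing_shift(G, classify, n_attr, latent_dim=512,
                         n_samples=1000, target=0.5,
                         lo=-3.0, hi=3.0, iters=20):
    """Binary-search the offset so the generated attribute ratio ~ target.

    Assumes the attribute ratio increases monotonically with the shift
    along n_attr; no retraining of G is involved.
    """
    rng = np.random.default_rng(0)
    z = rng.standard_normal((n_samples, latent_dim))
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        imgs = G(rebalance_latents(z, n_attr, mid))
        ratio = np.mean([classify(x) > 0.5 for x in imgs])
        if ratio < target:
            lo = mid  # attribute under-represented: push further along n_attr
        else:
            hi = mid  # attribute over-represented: pull back
    return (lo + hi) / 2.0
```

Because every sample receives the same offset, pairwise distances between latent codes are unchanged; only the location of the distribution along the semantic axis moves, which is one way to balance an attribute without collapsing diversity. Handling several attributes at once would amount to repeating this search along multiple (ideally disentangled) directions.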