Deep generative models have made substantial progress in improving training stability and the quality of generated data. Recently, there has been growing interest in the fairness of data generated by such models. Fairness matters in many applications, e.g. law enforcement, where biases in generated data can undermine efficacy. Central to fair data generation are the fairness metrics used to assess and compare different generative models. In this paper, we first review fairness metrics proposed in previous works and highlight their potential weaknesses. We then discuss a performance benchmark framework along with an assessment of alternative metrics.