Fair representation learning is an attractive approach that promises fairness of downstream predictors by encoding sensitive data. Unfortunately, recent work has shown that strong adversarial predictors can still exhibit unfairness by recovering sensitive attributes from these representations. In this work, we present Fair Normalizing Flows (FNF), a new approach offering more rigorous fairness guarantees for learned representations. Specifically, we consider a practical setting where we can estimate the probability density for sensitive groups. The key idea is to model the encoder as a normalizing flow trained to minimize the statistical distance between the latent representations of different groups. The main advantage of FNF is that its exact likelihood computation allows us to obtain guarantees on the maximum unfairness of any potentially adversarial downstream predictor. We experimentally demonstrate the effectiveness of FNF in enforcing various group fairness notions, as well as other attractive properties such as interpretability and transfer learning, on a variety of challenging real-world datasets.
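To make the core mechanism concrete, the following is a minimal, self-contained sketch, not the paper's implementation: it assumes PyTorch, treats the two sensitive-group densities as known Gaussians (p0, p1), uses a toy diagonal affine flow per group as the invertible encoder, and trains only the fairness term, a Monte-Carlo estimate of the symmetrized KL between the two latent distributions computed with exact change-of-variables log-likelihoods, while omitting any downstream task loss. All names (AffineFlow, latent_log_prob, etc.) are illustrative.

```python
import torch
import torch.nn as nn
from torch.distributions import MultivariateNormal

torch.manual_seed(0)
dim = 2

# Assumed (estimated) input densities for the two sensitive groups.
p0 = MultivariateNormal(torch.zeros(dim), torch.eye(dim))
p1 = MultivariateNormal(2.0 * torch.ones(dim), 0.5 * torch.eye(dim))


class AffineFlow(nn.Module):
    """Toy invertible encoder z = x * exp(s) + t with exact log-det Jacobian."""

    def __init__(self, dim):
        super().__init__()
        self.s = nn.Parameter(torch.zeros(dim))
        self.t = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        z = x * torch.exp(self.s) + self.t
        return z, self.s.sum()           # log |det df/dx| is exact

    def inverse(self, z):
        x = (z - self.t) * torch.exp(-self.s)
        return x, -self.s.sum()          # log |det df^{-1}/dz| is exact


def latent_log_prob(z, flow, base):
    """Exact log p_Z(z) via the change-of-variables formula."""
    x, log_det = flow.inverse(z)
    return base.log_prob(x) + log_det


f0, f1 = AffineFlow(dim), AffineFlow(dim)
opt = torch.optim.Adam(list(f0.parameters()) + list(f1.parameters()), lr=1e-2)

for step in range(2000):
    x0, x1 = p0.sample((256,)), p1.sample((256,))
    z0, _ = f0(x0)
    z1, _ = f1(x1)
    # Monte-Carlo symmetrized KL between the two latent distributions;
    # driving it to zero makes the groups indistinguishable in latent space,
    # which bounds the unfairness of any downstream predictor.
    kl_01 = (latent_log_prob(z0, f0, p0) - latent_log_prob(z0, f1, p1)).mean()
    kl_10 = (latent_log_prob(z1, f1, p1) - latent_log_prob(z1, f0, p0)).mean()
    loss = kl_01 + kl_10
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this hypothetical setup the exact log-likelihoods of both latent distributions are available at every training step, which is what allows a certified bound on the adversary's advantage rather than an empirical estimate against a fixed adversarial predictor.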