To avoid discriminatory uses of their data, organizations can learn to map the data into a representation that filters out information related to sensitive attributes. However, all existing fair representation learning methods face a fairness-information trade-off: to reach different points on the fairness-information plane, one must train a different model for each point. In this paper, we first demonstrate that fairness-information trade-offs are fully characterized by rate-distortion trade-offs. We then use this key result to propose SoFaiR, a single-shot fair representation learning method that generates, with one trained model, many points on the fairness-information plane. Beyond its computational savings, our single-shot approach is, to the best of our knowledge, the first fair representation learning method that explains what information is affected by changes in the fairness/distortion properties of the representation. Empirically, we find on three datasets that SoFaiR achieves fairness-information trade-offs similar to those of its multi-shot counterparts.
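For concreteness, one way to read the rate-distortion characterization is through the standard rate-distortion Lagrangian; the formulation below is an illustrative sketch of that connection (the distortion measure $d$, the encoder $q(z \mid x)$, and the multiplier $\beta$ are generic placeholders, not necessarily the exact objective used in the paper):
\[
\mathcal{L}(\beta) \;=\; \min_{q(z \mid x)} \; \underbrace{\mathbb{E}\big[\,d\big(x, \hat{x}(z)\big)\big]}_{\text{distortion}} \;+\; \beta \, \underbrace{I(X; Z)}_{\text{rate}}, \qquad \beta \ge 0 .
\]
Sweeping $\beta$ traces out the rate-distortion curve, and under the characterization stated above, points on that curve correspond to points on the fairness-information plane; this is what makes it plausible for a single trained model, conditioned on the desired trade-off level, to produce many such points in one shot rather than retraining per point.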