There is a well-known intrinsic trade-off between the fairness of a representation and the performance of classifiers derived from it. Due to the complexity of the optimisation algorithms in most modern representation learning approaches, for a given method it may be non-trivial to decide whether its fairness-performance curve is optimal, i.e., whether it is close to the true Pareto front for these quantities under the underlying data distribution. In this paper we propose a new method to compute the optimal Pareto front, which does not require the training of complex representation models. We show that optimal fair representations possess several useful structural properties, and that these properties enable a reduction of the computation of the Pareto front to a compact discrete problem. We then show that these compact approximating problems can be efficiently solved via off-the-shelf concave-convex programming methods. Since our approach is independent of any specific representation model, it may be used as a benchmark against which representation learning algorithms can be compared. We experimentally evaluate the approach on a number of real-world benchmark datasets.