Fairness-aware learning has recently become increasingly important, but most existing methods assume that fully annotated group labels are available. We emphasize that such an assumption is unrealistic for real-world applications, since group label annotations are expensive and can conflict with privacy concerns. In this paper, we consider a more practical scenario, dubbed Algorithmic Fairness with the Partially annotated Group labels (Fair-PG). We observe that, under Fair-PG, existing fairness methods, which use only the data with group labels, perform even worse than vanilla training, which simply uses the full data with only target labels. To address this problem, we propose a simple Confidence-based Group Label assignment (CGL) strategy that is readily applicable to any fairness-aware learning method. CGL uses an auxiliary group classifier to assign pseudo group labels, and assigns random labels to low-confidence samples. We first show theoretically that this design is better than the vanilla pseudo-labeling strategy in terms of fairness criteria. Then, we show empirically on the UTKFace, CelebA and COMPAS datasets that combining CGL with state-of-the-art fairness-aware in-processing methods jointly improves target accuracy and fairness metrics compared to the baseline methods. Furthermore, we show that CGL makes it possible to naturally augment the given group-labeled dataset with external datasets that have only target labels, so that both accuracy and fairness metrics can be improved. We will release our implementation publicly so that future research can reproduce our results.
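The assignment rule described above can be illustrated with a minimal sketch. It is not the released implementation: the function name `assign_group_labels`, the fixed `threshold`, and the use of uniform sampling for the random labels are all assumptions made for illustration, since the abstract does not specify how the threshold is chosen or from which distribution the random labels are drawn.

```python
import numpy as np


def assign_group_labels(probs, threshold, rng):
    """Assign group labels from auxiliary-classifier probabilities.

    probs: array of shape (n_samples, n_groups) with predicted group
           probabilities (hypothetical output of the group classifier).
    threshold: confidence cutoff (assumed to be given; hypothetical).
    rng: a numpy.random.Generator used for the random labels.
    """
    probs = np.asarray(probs, dtype=float)
    n_samples, n_groups = probs.shape

    confidence = probs.max(axis=1)   # confidence = largest predicted probability
    pseudo = probs.argmax(axis=1)    # vanilla pseudo-label
    low_conf = confidence < threshold

    # Assumption: random labels are drawn uniformly over the groups.
    random_labels = rng.integers(0, n_groups, size=n_samples)

    # Keep the pseudo-label when confident, otherwise use the random label.
    labels = np.where(low_conf, random_labels, pseudo)
    return labels, low_conf


rng = np.random.default_rng(0)
probs = np.array([[0.95, 0.05],
                  [0.55, 0.45],
                  [0.10, 0.90]])
labels, low_conf = assign_group_labels(probs, threshold=0.8, rng=rng)
```

In this toy input, the first and third samples keep their classifier predictions, while the second falls below the cutoff and receives a randomly drawn group label. The resulting labels could then be passed to any fairness-aware training method in place of the missing annotations.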