Fairness and robustness play vital roles in trustworthy machine learning. Motivated by safety-critical needs in various annotation-expensive vision applications, we introduce a novel learning framework, Fair Robust Active Learning (FRAL), generalizing conventional active learning to fair and adversarially robust scenarios. This framework allows us to achieve standard and robust minimax fairness with limited acquired labels. In FRAL, we then observe that existing fairness-aware data selection strategies suffer from either ineffectiveness under severe data imbalance or inefficiency due to the heavy computational cost of adversarial training. To address these two problems, we develop a novel Joint INconsistency (JIN) method that exploits prediction inconsistencies between benign and adversarial inputs as well as between standard and robust models. These two inconsistencies can be used to identify potential fairness gains and data-imbalance mitigations. Thus, by performing label acquisition with our inconsistency-based ranking metrics, we can alleviate the class imbalance issue and enhance minimax fairness with limited computation. Extensive experiments on diverse datasets and sensitive groups demonstrate that our method obtains the best standard and robust fairness under white-box PGD attacks compared with existing active data selection baselines.
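To make the abstract's description of JIN concrete, the sketch below shows one way the two prediction inconsistencies could be turned into a per-sample ranking score for label acquisition. The model names, the symmetric-KL choice of inconsistency measure, and the equal weighting of the two terms are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def joint_inconsistency_scores(std_model, robust_model, x_benign, x_adv):
    """Sketch: score unlabeled samples by two prediction inconsistencies.

    Assumptions (illustrative, not the paper's exact formulation):
    - `std_model` is trained with standard ERM; `robust_model` with
      adversarial training.
    - `x_adv` are PGD-perturbed versions of `x_benign`.
    - Inconsistency is measured by symmetric KL between softmax outputs.
    """
    std_model.eval()
    robust_model.eval()
    with torch.no_grad():
        p_benign = F.softmax(robust_model(x_benign), dim=1)
        p_adv = F.softmax(robust_model(x_adv), dim=1)
        p_std = F.softmax(std_model(x_benign), dim=1)

    def sym_kl(p, q, eps=1e-8):
        # Symmetric KL divergence between two categorical distributions.
        p, q = p.clamp_min(eps), q.clamp_min(eps)
        return ((p * (p / q).log()).sum(dim=1)
                + (q * (q / p).log()).sum(dim=1))

    # Inconsistency between benign and adversarial predictions:
    # a proxy for potential robust-fairness gains.
    adv_inc = sym_kl(p_benign, p_adv)
    # Inconsistency between standard and robust models:
    # a proxy for data-imbalance mitigation.
    model_inc = sym_kl(p_std, p_benign)

    # Equal weighting is an assumption; samples with the largest joint
    # score would be prioritized for label acquisition.
    return adv_inc + model_inc
```

Under these assumptions, the unlabeled pool would be ranked by the returned scores and the top-scoring samples sent for annotation in each active learning round.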