Addressing fairness concerns about machine learning models is a crucial step towards their long-term adoption in real-world automated systems. While many approaches have been developed for training fair models from data, little is known about the robustness of these methods to data corruption. In this work we consider fairness-aware learning under worst-case data manipulations. We show that an adversary can in some situations force any learner to return an overly biased classifier, regardless of the sample size and whether or not accuracy is degraded, and that the strength of the excess bias increases for learning problems with underrepresented protected groups in the data. We also prove that our hardness results are tight up to constant factors. To this end, we study two natural learning algorithms that optimize for both accuracy and fairness, and show that these algorithms enjoy guarantees that are order-optimal in terms of the corruption ratio and the protected group frequencies in the large-data limit.
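As a rough illustration of the kind of learner the abstract refers to, the sketch below trains a linear classifier by jointly minimizing empirical error and a demographic-parity-style gap between two protected groups on possibly corrupted data. It is a minimal sketch under assumptions: the logistic model, the particular fairness penalty, the weight `lam`, and the function names are all choices made for the example and are not the paper's specific algorithms or fairness measure.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fair_erm(X, y, group, lam=1.0, lr=0.1, epochs=500):
    """Minimize average logistic loss + lam * |mean score gap between groups|.

    X: (n, d) features, y: (n,) labels in {0, 1},
    group: (n,) protected attribute in {0, 1}.
    The data may contain adversarially corrupted points; this plain
    penalized ERM makes no attempt to detect or remove them.
    """
    n, d = X.shape
    w = np.zeros(d)
    g0, g1 = group == 0, group == 1
    for _ in range(epochs):
        p = sigmoid(X @ w)
        # Gradient of the average logistic loss.
        grad_acc = X.T @ (p - y) / n
        # Demographic-parity-style penalty: absolute gap between the
        # average predicted scores of the two protected groups.
        gap = p[g1].mean() - p[g0].mean()
        s = p * (1 - p)  # derivative of sigmoid at each point
        grad_gap = (X[g1].T @ s[g1]) / g1.sum() - (X[g0].T @ s[g0]) / g0.sum()
        w -= lr * (grad_acc + lam * np.sign(gap) * grad_gap)
    return w
```

On a toy dataset, increasing `lam` trades classification accuracy for a smaller score gap between the groups; the abstract's hardness result says that, once the data are adversarially corrupted, no learner of this or any other form can avoid an excess bias whose magnitude grows as the protected groups become more underrepresented.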