Machine learning models have been deployed across many aspects of society, often in situations that affect social welfare. Although these models offer streamlined solutions to large-scale problems, they may contain biases and treat groups or individuals unfairly based on protected attributes such as gender. In this paper, we introduce several examples of machine learning gender bias in practice, followed by formalizations of fairness. We provide a survey of fairness research by detailing influential pre-processing, in-processing, and post-processing bias mitigation algorithms. We then propose an end-to-end bias mitigation framework, which employs a fusion of pre-, in-, and post-processing methods to leverage the strengths of each individual technique. We test this method, along with the standard techniques we review, on a deep neural network to analyze bias mitigation in a deep learning setting. We find that our end-to-end bias mitigation framework outperforms the baselines with respect to several fairness metrics, suggesting its promise as a method for improving fairness. As society increasingly relies on artificial intelligence to help in decision-making, addressing gender biases present in deep learning models is imperative. To provide readers with the tools to assess the fairness of machine learning models and mitigate the biases present in them, we discuss multiple open source packages for fairness in AI.
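To give a concrete sense of the kind of fairness formalization the abstract refers to, the following is a minimal illustrative sketch (not the paper's implementation) of statistical parity difference, one widely used group fairness metric. It assumes binary predictions and a binary protected attribute such as gender; the function name and the toy data are hypothetical.

```python
# Minimal sketch of statistical parity difference:
# P(y_hat = 1 | unprivileged) - P(y_hat = 1 | privileged); 0 indicates parity.
# Assumes binary predictions and a binary protected attribute (1 = privileged).
import numpy as np

def statistical_parity_difference(y_pred: np.ndarray, protected: np.ndarray) -> float:
    """Difference in positive-prediction rates between the two groups."""
    rate_unpriv = y_pred[protected == 0].mean()
    rate_priv = y_pred[protected == 1].mean()
    return float(rate_unpriv - rate_priv)

# Hypothetical example: a model that favors the privileged group.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
protected = np.array([1, 1, 1, 1, 0, 0, 0, 0])
print(statistical_parity_difference(y_pred, protected))  # -0.5, i.e. biased toward the privileged group
```

Open source toolkits such as AIF360 and Fairlearn provide this metric (along with disparate impact, equalized odds, and others) out of the box; the sketch above only illustrates the underlying definition.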