Towards Understanding Fairness and its Composition in Ensemble Machine Learning

Machine Learning (ML) software has been widely adopted in modern society, with reported fairness implications for minority groups based on race, sex, age, etc. Many recent works have proposed methods to measure and mitigate algorithmic bias in ML models. The existing approaches focus on single classifier-based ML models. However, real-world ML models are often composed of multiple independent or dependent learners in an ensemble (e.g., Random Forest), where fairness composes in a non-trivial way. How does fairness compose in ensembles? How does each learner affect the ultimate fairness of the ensemble? Can fair learners result in an unfair ensemble? Furthermore, studies have shown that hyperparameters influence the fairness of ML models. Ensemble hyperparameters are more complex since they affect how learners are combined in different categories of ensembles. Understanding the impact of ensemble hyperparameters on fairness will help programmers design fair ensembles. Today, these questions are not fully understood for different ensemble algorithms. In this paper, we comprehensively study popular real-world ensembles: bagging, boosting, stacking, and voting. We have developed a benchmark of 168 ensemble models collected from Kaggle on four popular fairness datasets. We use existing fairness metrics to understand the composition of fairness. Our results show that ensembles can be designed to be fairer without using mitigation techniques. We also identify the interplay between fairness composition and data characteristics to guide fair ensemble design. Finally, our benchmark can be leveraged for further research on fair ensembles. To the best of our knowledge, this is one of the first and largest studies on fairness composition in ensembles yet presented in the literature.
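
To make the question "Can fair learners result in an unfair ensemble?" concrete, the following is a minimal toy sketch, not the paper's benchmark or its results. It assumes statistical parity difference as the fairness metric (the abstract only says "existing fairness metrics") and hard majority voting as the ensemble rule. The names `statistical_parity_difference`, `majority_vote`, and the hand-built prediction lists are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def statistical_parity_difference(preds: Sequence[int], groups: Sequence[int]) -> float:
    """P(pred = 1 | unprivileged) - P(pred = 1 | privileged).

    Assumption: group 0 is the unprivileged group, group 1 the privileged one.
    A value of 0.0 means both groups receive positive predictions at the same rate.
    """
    unpriv = [p for p, g in zip(preds, groups) if g == 0]
    priv = [p for p, g in zip(preds, groups) if g == 1]
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)


def majority_vote(*learner_preds: Sequence[int]) -> List[int]:
    """Hard voting: predict 1 when more than half of the learners predict 1."""
    n_learners = len(learner_preds)
    return [1 if 2 * sum(votes) > n_learners else 0 for votes in zip(*learner_preds)]


# Toy data: four unprivileged individuals followed by four privileged ones.
groups = [0, 0, 0, 0, 1, 1, 1, 1]

# Each learner gives positives to exactly half of each group, so each one
# has a statistical parity difference of 0.0 on its own.
# On the privileged group the learners agree; on the unprivileged group
# they agree on only one individual.
learner_a = [1, 1, 0, 0, 1, 1, 0, 0]
learner_b = [1, 0, 1, 0, 1, 1, 0, 0]
learner_c = [1, 0, 0, 1, 1, 1, 0, 0]

ensemble = majority_vote(learner_a, learner_b, learner_c)

for name, preds in [("A", learner_a), ("B", learner_b), ("C", learner_c)]:
    print(f"learner {name}: SPD = {statistical_parity_difference(preds, groups)}")
print(f"ensemble:  SPD = {statistical_parity_difference(ensemble, groups)}")
```

In this construction every learner is individually fair under the chosen metric, yet the voted ensemble favors the privileged group, because disagreement among learners is concentrated in one group. Whether and how often this happens for real bagging, boosting, stacking, and voting models is the empirical question the study addresses.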
