Bias in machine learning has rightly received significant attention over the last decade. However, most fair machine learning (fair-ML) work on bias in decision-making systems has focused solely on the offline setting. Although online systems are widespread in the real world, work on identifying and correcting bias in the online setting is severely lacking. The online environment poses unique challenges that make bias harder to address than in the offline setting. First, Streaming Machine Learning (SML) algorithms must process a constantly evolving real-time data stream. Second, they must adapt to changing data distributions (concept drift) to make accurate predictions on newly arriving data. Adding fairness constraints to this already complicated task is not straightforward. In this work, we focus on the challenges of achieving fairness in biased data streams in the presence of concept drift, with access to only one sample at a time. We present Fair Sampling over Stream ($FS^2$), a novel fair rebalancing approach that can be integrated with SML classification algorithms. Furthermore, we devise the first unified performance-fairness metric, Fairness Bonded Utility (FBU), to efficiently evaluate and compare the performance-fairness trade-offs of different bias mitigation methods. FBU reduces this comparison across multiple techniques to a single unified and intuitive evaluation, making it easier for model designers to choose a technique. Overall, extensive evaluations show that our measures surpass those of other fair online techniques previously reported in the literature.
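To make the one-sample-at-a-time constraint concrete, the following minimal Python sketch tracks a standard group-fairness quantity (statistical parity difference) incrementally over a stream. It is not an implementation of $FS^2$ or FBU, whose details the abstract does not give. The class name `StreamingParityMonitor`, the `decay` parameter (a simple forgetting factor standing in for drift handling), and the binary group encoding (0 and 1) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingParityMonitor:
    """Hypothetical helper: tracks statistical parity difference over a
    stream, seeing one (group, prediction) pair at a time."""

    # Assumption: 1.0 means no forgetting; values below 1.0 down-weight
    # older samples so the estimate can follow a drifting distribution.
    decay: float = 1.0
    # Per-group weighted counts: [samples seen, positive predictions].
    # Assumption: the protected attribute is binary, encoded as 0 or 1.
    counts: dict = field(
        default_factory=lambda: {0: [0.0, 0.0], 1: [0.0, 0.0]}
    )

    def update(self, group: int, predicted_positive: bool) -> None:
        # Age all existing counts, then add the new sample to its group.
        for tally in self.counts.values():
            tally[0] *= self.decay
            tally[1] *= self.decay
        self.counts[group][0] += 1.0
        if predicted_positive:
            self.counts[group][1] += 1.0

    def positive_rate(self, group: int) -> float:
        seen, positives = self.counts[group]
        return positives / seen if seen > 0 else 0.0

    def parity_difference(self) -> float:
        # Positive-prediction rate of group 0 minus that of group 1;
        # 0.0 means both groups receive positive predictions equally often.
        return self.positive_rate(0) - self.positive_rate(1)


monitor = StreamingParityMonitor()
for group, prediction in [(0, True), (0, True), (1, True), (1, False)]:
    monitor.update(group, prediction)
print(monitor.parity_difference())
```

With `decay=1.0` the monitor reports the parity difference over the whole stream so far. A value below 1.0 weights recent samples more heavily, which is one simple way to keep the estimate meaningful when the data distribution changes.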