With an increased focus on incorporating fairness in machine learning models, it becomes imperative not only to assess and mitigate bias at each stage of the machine learning pipeline but also to understand the downstream impacts of bias across stages. Here we consider a general, but realistic, scenario in which a predictive model is learned from (potentially biased) training data, and model predictions are assessed post-hoc for fairness by some auditing method. We provide a theoretical analysis of how a specific form of data bias, differential sampling bias, propagates from the data stage to the prediction stage. Unlike prior work, we evaluate the downstream impacts of data biases quantitatively rather than qualitatively and prove theoretical guarantees for detection. Under reasonable assumptions, we quantify how the amount of bias in the model predictions varies as a function of the amount of differential sampling bias in the data, and at what point this bias becomes provably detectable by the auditor. Through experiments on two criminal justice datasets -- the well-known COMPAS dataset and historical data from NYPD's stop and frisk policy -- we demonstrate that the theoretical results hold in practice even when our assumptions are relaxed.