Systems that offer continuous model monitoring have emerged in response to (1) well-documented failures of deployed Machine Learning (ML) and Artificial Intelligence (AI) models and (2) new regulatory requirements impacting these models. Existing monitoring systems continuously track the performance of deployed ML models and compute feature importance (a.k.a. explanations) for each prediction to help developers identify the root causes of emergent model performance problems. We present Quantile Demographic Drift (QDD), a novel model bias quantification metric that uses quantile binning to measure differences in the overall prediction distributions over subgroups. QDD is ideal for continuous monitoring scenarios, does not suffer from the statistical limitations of conventional threshold-based bias metrics, and does not require outcome labels (which may not be available at runtime). We incorporate QDD into a continuous model monitoring system, called FairCanary, that reuses existing explanations computed for each individual prediction to quickly compute explanations for the QDD bias metric. This optimization makes FairCanary an order of magnitude faster than previous work that has tried to generate feature-level bias explanations.
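To make the quantile-binning idea concrete, the following is a minimal sketch (not the paper's implementation) of comparing two subgroups' prediction distributions at matched quantiles. The function name `qdd_sketch` and the choice of 10 bins are illustrative assumptions; note that, as the abstract states, no outcome labels are involved.

```python
import numpy as np

def qdd_sketch(scores_a, scores_b, n_bins=10):
    """Illustrative sketch: per-quantile difference between two
    subgroups' model prediction distributions (no labels needed)."""
    # Evaluate both groups' score distributions at the same quantile levels.
    qs = np.linspace(0.0, 1.0, n_bins + 1)
    quantiles_a = np.quantile(scores_a, qs)
    quantiles_b = np.quantile(scores_b, qs)
    # The per-quantile gap captures how the groups' prediction
    # distributions drift apart across the whole score range,
    # rather than only at a single decision threshold.
    return quantiles_b - quantiles_a
```

A uniform upward shift in one group's scores shows up as a constant per-quantile gap, whereas a threshold-based metric would only register it if scores crossed the decision boundary.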