In recent years, machine learning (ML) algorithms have been deployed in safety-critical and high-stakes decision-making, where the fairness of algorithms is of paramount importance. Fairness in ML centers on detecting bias towards certain demographic populations induced by an ML classifier and on proposing algorithmic solutions to mitigate that bias with respect to different fairness definitions. To this end, several fairness verifiers have been proposed that compute the bias in the predictions of an ML classifier, essentially beyond a finite dataset, given the probability distribution of input features. In the context of verifying linear classifiers, existing fairness verifiers are limited in accuracy, due to imprecise modelling of correlations among features, and in scalability, due to restrictive formulations of the classifiers as SSAT or SMT formulas or to sampling. In this paper, we propose an efficient fairness verifier, called FVGM, that encodes the correlations among features as a Bayesian network. In contrast to existing verifiers, FVGM uses a stochastic subset-sum based approach for verifying linear classifiers. Experimentally, we show that FVGM yields an accurate and scalable assessment for more diverse families of fairness-enhancing algorithms, fairness attacks, and group/causal fairness metrics than state-of-the-art verifiers. We also demonstrate that FVGM facilitates the computation of fairness influence functions as a stepping stone to detecting the source of bias induced by subsets of features.
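To give a rough feel for what a stochastic subset-sum computation over a linear classifier might look like, the following minimal sketch computes the probability of a positive prediction by dynamic programming over partial weighted sums, then reports the gap between two groups (statistical parity difference). It is not the FVGM algorithm: it assumes Boolean features, integer weights, and features that are independent given the sensitive group, whereas the abstract describes encoding feature correlations as a Bayesian network. The function names `positive_rate` and `statistical_parity_difference`, and all numbers in the usage note, are hypothetical.

```python
from collections import defaultdict


def positive_rate(weights, probs, threshold):
    """Return P(sum_i w_i * x_i >= threshold) for independent x_i ~ Bernoulli(probs[i]).

    Assumption (not from the paper): features are Boolean, weights are integers,
    and features are mutually independent.
    """
    dist = {0: 1.0}  # maps each reachable partial sum to its probability mass
    for w, p in zip(weights, probs):
        nxt = defaultdict(float)
        for s, mass in dist.items():
            nxt[s] += mass * (1.0 - p)  # feature takes value 0
            nxt[s + w] += mass * p      # feature takes value 1
        dist = nxt
    return sum(mass for s, mass in dist.items() if s >= threshold)


def statistical_parity_difference(weights, threshold, probs_by_group):
    """Return (max rate - min rate, per-group rates) across sensitive groups.

    probs_by_group maps a group label to that group's per-feature
    Bernoulli parameters (hypothetical inputs for illustration).
    """
    rates = {
        group: positive_rate(weights, probs, threshold)
        for group, probs in probs_by_group.items()
    }
    return max(rates.values()) - min(rates.values()), rates


if __name__ == "__main__":
    gap, rates = statistical_parity_difference(
        weights=[2, -1, 3],
        threshold=3,
        probs_by_group={"A": [0.5, 0.5, 0.5], "B": [0.2, 0.5, 0.4]},
    )
    print(rates, gap)
```

For these made-up inputs, enumerating the eight feature assignments by hand gives a positive rate of 3/8 for group A and 0.24 for group B. The dynamic program reaches the same values without enumerating assignments, since its table is indexed by distinct partial sums rather than by feature combinations.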