We propose methods for making inferences on the fairness and accuracy of a given classifier, using only aggregate population statistics. This is necessary when it is impossible to obtain individual classification data, for instance when there is no access to the classifier or to a representative individual-level validation set. We study fairness with respect to the equalized odds criterion, which we generalize to multiclass classification. We propose a measure of unfairness with respect to this criterion, which quantifies the fraction of the population that is treated unfairly. We then show how inferences on the unfairness and error of a given classifier can be obtained using only aggregate label statistics such as the rate of prediction of each label in each sub-population, as well as the true rate of each label. We derive inference procedures for binary classifiers and for multiclass classifiers, for the case where confusion matrices in each sub-population are known, and for the significantly more challenging case where they are unknown. We report experiments on data sets representing diverse applications, which demonstrate the effectiveness and the wide range of possible uses of the proposed methodology.
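To make the aggregate statistics mentioned above concrete, the following minimal sketch shows how the rate of prediction of each label in a sub-population follows from that sub-population's confusion matrix and its true label rates. It is an illustration under stated assumptions, not the inference procedure derived in the paper: the function names, the row-normalized confusion-matrix convention, and the example numbers are all hypothetical, and equalized odds is read here as the requirement that the conditional confusion matrices coincide across sub-populations.

```python
import numpy as np

def predicted_label_rates(confusion, true_rates):
    """Rate of prediction of each label in one sub-population.

    Assumed convention: confusion[y, z] = P(predicted = z | true = y),
    so each row sums to 1, and true_rates[y] = P(true = y).
    """
    return np.asarray(true_rates) @ np.asarray(confusion)

def error_rate(confusion, true_rates):
    """Probability that the predicted label differs from the true label."""
    correct = np.asarray(true_rates) @ np.diag(np.asarray(confusion))
    return 1.0 - float(correct)

def satisfies_equalized_odds(confusions, tol=1e-9):
    """True if all sub-populations share the same conditional confusion matrix."""
    first = np.asarray(confusions[0])
    return all(np.allclose(c, first, rtol=0.0, atol=tol) for c in confusions[1:])

# Hypothetical example: two sub-populations with the same confusion matrix
# but different true label rates.
shared_confusion = np.array([[0.9, 0.1],
                             [0.2, 0.8]])
true_rates_a = np.array([0.5, 0.5])
true_rates_b = np.array([0.2, 0.8])

rates_a = predicted_label_rates(shared_confusion, true_rates_a)
rates_b = predicted_label_rates(shared_confusion, true_rates_b)
err_a = error_rate(shared_confusion, true_rates_a)
err_b = error_rate(shared_confusion, true_rates_b)
fair = satisfies_equalized_odds([shared_confusion, shared_confusion])
```

In this example the classifier satisfies equalized odds, yet the two sub-populations show different prediction rates and different error rates because their true label rates differ. The mapping from confusion matrices to aggregate statistics is therefore many-to-one, which is consistent with the abstract's remark that the case of unknown confusion matrices is significantly more challenging.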