Despite the success of large-scale empirical risk minimization (ERM) at achieving high accuracy across a variety of machine learning tasks, fair ERM is hindered by the incompatibility of fairness constraints with stochastic optimization. We consider the problem of fair classification with discrete sensitive attributes and potentially large models and data sets, which require stochastic solvers. Existing in-processing fairness algorithms are either impractical in the large-scale setting, because they require large batches of data at each iteration, or not guaranteed to converge. In this paper, we develop the first stochastic in-processing fairness algorithm with guaranteed convergence. For the demographic parity, equalized odds, and equal opportunity notions of fairness, we provide slight variations of our algorithm, called FERMI, and prove that each of these variations converges in stochastic optimization with any batch size. Empirically, we show that FERMI is amenable to stochastic solvers with multiple (non-binary) sensitive attributes and non-binary targets, performing well even with a minibatch size as small as one. Extensive experiments show that FERMI achieves the most favorable tradeoffs between fairness violation and test accuracy across all tested setups compared with state-of-the-art baselines for demographic parity, equalized odds, and equal opportunity. These benefits are especially significant with small batch sizes and for non-binary classification with a large number of sensitive attributes, making FERMI a practical fairness algorithm for large-scale problems.
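To illustrate why small batches are a difficulty for fairness constraints, the following minimal sketch computes a demographic parity gap (the largest difference in positive-prediction rate between groups) on a full data set and on two minibatches of size two. It is not the FERMI algorithm; the function name `demographic_parity_gap` and the toy data are assumptions introduced only for illustration.

```python
from collections import defaultdict


def demographic_parity_gap(y_pred, sensitive):
    """Largest difference in positive-prediction rate between any two groups.

    Hypothetical helper for illustration; groups absent from the input
    are simply ignored.
    """
    counts = defaultdict(int)
    positives = defaultdict(int)
    for y, s in zip(y_pred, sensitive):
        counts[s] += 1
        positives[s] += int(y == 1)
    rates = [positives[s] / counts[s] for s in counts]
    return max(rates) - min(rates)


# Toy predictions for two groups, "a" and "b" (made-up data).
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Full data: group "a" has rate 3/4, group "b" has rate 1/4.
full_gap = demographic_parity_gap(y_pred, sensitive)

# Two different minibatches of size two drawn from the same data.
batch_1 = [0, 6]  # one positive from each group
batch_2 = [0, 4]  # a positive from "a", a negative from "b"
gap_1 = demographic_parity_gap([y_pred[i] for i in batch_1],
                               [sensitive[i] for i in batch_1])
gap_2 = demographic_parity_gap([y_pred[i] for i in batch_2],
                               [sensitive[i] for i in batch_2])

print(full_gap, gap_1, gap_2)
```

On this toy data the full-data gap is 0.5, while the two minibatches report 0.0 and 1.0. A fairness penalty estimated naively from such batches can therefore be far from the quantity it is meant to control, which is one way to read the incompatibility with stochastic optimization described above.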