Machine learning models are increasingly used in high-stakes decision-making systems. In such applications, a major concern is that these models sometimes discriminate against certain demographic groups, for example on the basis of race, gender, or age. Another major concern is the violation of users' privacy. While fair learning algorithms have been developed to mitigate discrimination, these algorithms can still leak sensitive information, such as individuals' health or financial records. Utilizing the notion of differential privacy (DP), prior works have aimed to develop learning algorithms that are both private and fair. However, existing algorithms for DP fair learning are either not guaranteed to converge or require a full batch of data in each iteration to converge. In this paper, we provide the first stochastic differentially private algorithm for fair learning that is guaranteed to converge. Here, the term "stochastic" refers to the fact that our proposed algorithm converges even when minibatches of data are used at each iteration (i.e., stochastic optimization). Our framework is flexible enough to permit different fairness notions, including demographic parity and equalized odds. In addition, our algorithm can be applied to non-binary classification tasks with multiple (non-binary) sensitive attributes. As a byproduct of our convergence analysis, we provide the first utility guarantee for a DP algorithm for solving nonconvex-strongly concave min-max problems. Our numerical experiments show that the proposed algorithm consistently offers significant performance gains over the state-of-the-art baselines, and can be applied to larger-scale problems with non-binary target/sensitive attributes.
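To make the setting concrete, the following is a minimal Python sketch of one noisy stochastic gradient descent-ascent step on a fairness-regularized min-max objective, the kind of private stochastic min-max update the abstract alludes to. All names, gradient oracles, and hyperparameters here are illustrative assumptions for exposition only, not the paper's actual algorithm or analysis.

```python
# A minimal sketch of one noisy stochastic gradient descent-ascent (DP-SGDA)
# step on a fairness-regularized min-max objective. All names, oracles, and
# hyperparameters are illustrative assumptions, not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)


def clip(g, c):
    """Rescale the gradient g so its L2 norm is at most c."""
    norm = np.linalg.norm(g)
    return g if norm <= c else g * (c / norm)


def dp_sgda_step(theta, lam, grad_theta_fn, grad_lam_fn, batch,
                 lr_theta=0.1, lr_lam=0.1, clip_norm=1.0, noise_mult=1.0):
    """One descent step on the model parameters theta and one ascent step on
    the dual (fairness) parameters lam, using only a minibatch. Per-example
    clipping plus Gaussian noise on the averaged gradient is the standard
    mechanism for making each update differentially private."""
    # Per-example gradients w.r.t. theta: clip, average, then add noise.
    per_example = np.stack(
        [clip(grad_theta_fn(theta, lam, x), clip_norm) for x in batch])
    noise = rng.normal(0.0, noise_mult * clip_norm, size=theta.shape)
    g_theta = per_example.mean(axis=0) + noise / len(batch)

    # Ascent direction for the (strongly concave) dual objective; it would be
    # clipped and noised the same way if it also touches sensitive data.
    g_lam = grad_lam_fn(theta, lam, batch)

    theta = theta - lr_theta * g_theta   # minimize over model parameters
    lam = lam + lr_lam * g_lam           # maximize over fairness multipliers
    return theta, lam


# Toy usage with stand-in gradient oracles (purely illustrative).
d = 3
theta, lam = np.zeros(d), np.zeros(1)
grad_theta_fn = lambda th, lm, x: th - x              # stand-in loss gradient
grad_lam_fn = lambda th, lm, b: np.array([0.05])      # stand-in fairness-violation gradient
batch = [rng.normal(size=d) for _ in range(8)]
theta, lam = dp_sgda_step(theta, lam, grad_theta_fn, grad_lam_fn, batch)
```

The point of the sketch is that each update touches only a minibatch, which is exactly the "stochastic" property the abstract claims while still guaranteeing convergence under DP noise.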