A growing body of work proposes methods for mitigating bias in machine learning systems. These methods typically rely on access to protected attributes such as race, gender, or age. However, this reliance raises two significant challenges: (1) protected attributes may not be available, or their use may not be legal, and (2) it is often desirable to consider multiple protected attributes simultaneously, as well as their intersections. In the context of mitigating bias in occupation classification, we propose a method for discouraging correlation between the predicted probability of an individual's true occupation and a word embedding of their name. This method leverages the societal biases that are encoded in word embeddings, eliminating the need for access to protected attributes. Crucially, it requires access to individuals' names only at training time, not at deployment time. We evaluate two variations of our proposed method using a large-scale dataset of online biographies. We find that both variations simultaneously reduce race and gender biases, with almost no reduction in the classifier's overall true positive rate.
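To make the core idea concrete, the following is a minimal PyTorch sketch of a training loss that discourages correlation between the predicted probability of the true occupation and a word embedding of the individual's name. The function name `covariance_penalty_loss`, the weight `lambda_cov`, and the batch-level covariance estimate are illustrative assumptions for this sketch, not the paper's exact formulation of either variation.

```python
import torch
import torch.nn.functional as F

def covariance_penalty_loss(logits, labels, name_embs, lambda_cov=1.0):
    """Cross-entropy plus a penalty on the covariance between the
    predicted probability of each individual's true occupation and
    the word embedding of their name, estimated over the batch.

    logits:    (batch, num_occupations) classifier outputs
    labels:    (batch,) true occupation indices
    name_embs: (batch, emb_dim) word embeddings of names
    """
    ce = F.cross_entropy(logits, labels)

    # Predicted probability assigned to each individual's true occupation.
    probs = F.softmax(logits, dim=1)
    p_true = probs.gather(1, labels.unsqueeze(1)).squeeze(1)  # (batch,)

    # Sample covariance between p_true and each name-embedding dimension.
    p_centered = p_true - p_true.mean()
    e_centered = name_embs - name_embs.mean(dim=0, keepdim=True)
    cov = (p_centered.unsqueeze(1) * e_centered).mean(dim=0)  # (emb_dim,)

    # Driving this norm toward zero decorrelates the prediction
    # from every dimension of the name embedding.
    return ce + lambda_cov * cov.norm(p=2)
```

Because the penalty is computed only during training, the classifier itself never takes the name embedding as input, which is consistent with the property that names are required at training time but not at deployment time.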