Machine learning (ML) is increasingly used in high-stakes applications that affect society, so it is critical that ML models do not propagate discrimination. Collecting accurately labeled data in societal applications is challenging and costly. Active learning is a promising approach for building an accurate classifier by interactively querying an oracle within a labeling budget. We design algorithms for fair active learning that carefully select the data points to be labeled so as to balance model accuracy and fairness. We demonstrate the effectiveness and efficiency of the proposed algorithms on widely used benchmark datasets under the demographic parity and equalized odds notions of fairness.
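The two fairness notions named in the abstract can be made concrete with a small sketch. It uses the standard textbook definitions for a binary classifier and a binary sensitive attribute: demographic parity compares positive-prediction rates across groups, and equalized odds compares true-positive and false-positive rates across groups. The function names, the toy data, and the choice to report each notion as a single absolute gap are assumptions made for illustration; they are not taken from the paper's algorithms or experiments.

```python
def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rate between groups 0 and 1.

    Assumes binary predictions (0/1) and that both groups are non-empty.
    """
    rates = []
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, group) if a == g]
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])


def equalized_odds_gap(y_true, y_pred, group):
    """Larger of the between-group gaps in false-positive and true-positive rate.

    Assumes every (group, true label) combination has at least one example.
    """
    gaps = []
    for label in (0, 1):  # label 0 gives the FPR gap, label 1 the TPR gap
        rates = []
        for g in (0, 1):
            preds = [
                p
                for t, p, a in zip(y_true, y_pred, group)
                if a == g and t == label
            ]
            rates.append(sum(preds) / len(preds))
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)


# Hypothetical toy data: eight examples, four per sensitive group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

print(demographic_parity_gap(y_pred, group))      # 0.5
print(equalized_odds_gap(y_true, y_pred, group))  # 0.5
```

A gap of 0 means the classifier satisfies the corresponding notion exactly on the evaluated sample. One way a fairness-aware active learner could use such quantities is to track them alongside accuracy as labels are acquired, but how the paper's algorithms actually combine the two is not specified in the abstract.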