Ethical bias in machine learning models has become a matter of concern in the software engineering community. Most prior software engineering work has concentrated on finding ethical bias in models rather than fixing it. Once bias has been found, the next step is mitigation. Prior research has mainly relied on supervised approaches to achieve fairness. In the real world, however, data with trustworthy ground truth is hard to obtain, and the ground truth itself can contain human bias. Semi-supervised learning is a machine learning technique in which labeled data is used incrementally to generate pseudo-labels for the rest of the data, after which all of the data is used for model training. In this work, we apply four popular semi-supervised techniques as pseudo-labelers to create fair classification models. Our framework, Fair-SSL, takes a very small amount (10%) of labeled data as input and generates pseudo-labels for the unlabeled data. We then synthetically generate new data points to balance the training data by class and protected attribute, as proposed by Chakraborty et al. in FSE 2021. Finally, the classification model is trained on the balanced pseudo-labeled data and validated on test data. After experimenting on ten datasets and three learners, we find that Fair-SSL achieves performance similar to that of three state-of-the-art bias mitigation algorithms. The clear advantage of Fair-SSL is that it requires only 10% of the labeled training data. To the best of our knowledge, this is the first SE work in which semi-supervised techniques are used to mitigate ethical bias in SE ML models.
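The pipeline described above (label 10% of the data, pseudo-label the rest, balance by class and protected attribute, then train) can be illustrated with a small sketch. The abstract does not name the four semi-supervised techniques or give implementation details, so everything here is an assumption made for illustration: self-training stands in as the pseudo-labeler, a nearest-centroid rule stands in as the learner, and `balance` uses a simple SMOTE-like interpolation rather than the procedure of Chakraborty et al. The function names, the toy data, and the `batch` size are hypothetical.

```python
import random
from collections import defaultdict
from math import dist


def centroids(X, y):
    """Mean feature vector per class label (the stand-in 'model')."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] += 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict_with_margin(cents, row):
    """Nearest class plus a confidence proxy: gap between the two nearest centroids."""
    ranked = sorted((dist(row, c), label) for label, c in cents.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def self_train(X, known, batch=20):
    """Self-training: repeatedly pseudo-label the most confident unlabeled rows.

    known maps row index -> trusted label; returns a label list aligned with X.
    """
    labels = dict(known)
    pool = [i for i in range(len(X)) if i not in labels]
    while pool:
        seen = list(labels)
        cents = centroids([X[i] for i in seen], [labels[i] for i in seen])
        scored = sorted(
            ((predict_with_margin(cents, X[i]), i) for i in pool),
            key=lambda t: t[0][1],
            reverse=True,
        )
        for (label, _margin), i in scored[:batch]:
            labels[i] = label
        pool = [i for _pred, i in scored[batch:]]
    return [labels[i] for i in range(len(X))]


def balance(X, y, protected, rng):
    """Grow every (class, protected) subgroup to the size of the largest one.

    New points are interpolated between two random rows of the same subgroup
    (an assumed SMOTE-like stand-in, not the FSE 2021 method).
    """
    groups = defaultdict(list)
    for row, label, p in zip(X, y, protected):
        groups[(label, p)].append(row)
    target = max(len(rows) for rows in groups.values())
    out_X, out_y, out_p = list(X), list(y), list(protected)
    for (label, p), rows in groups.items():
        for _ in range(target - len(rows)):
            a, b = rng.choice(rows), rng.choice(rows)
            t = rng.random()
            out_X.append([u + t * (v - u) for u, v in zip(a, b)])
            out_y.append(label)
            out_p.append(p)
    return out_X, out_y, out_p


# Toy data: two Gaussian blobs with a skewed binary protected attribute.
rng = random.Random(0)
X, y_true, protected = [], [], []
for label, center in enumerate([(0.0, 0.0), (4.0, 4.0)]):
    for k in range(100):
        X.append([rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)])
        y_true.append(label)
        protected.append(int(k % 4 != 0) if label == 1 else k % 2)

# Keep trusted labels for only 10% of the rows (10 per class).
known = {i: y_true[i] for i in list(range(10)) + list(range(100, 110))}

y_pseudo = self_train(X, known)
X_bal, y_bal, p_bal = balance(X, y_pseudo, protected, rng)
model = centroids(X_bal, y_bal)  # final learner trained on balanced pseudo-labeled data

group_sizes = defaultdict(int)
for group in zip(y_bal, p_bal):
    group_sizes[group] += 1
agreement = sum(a == b for a, b in zip(y_pseudo, y_true)) / len(y_true)
print(dict(group_sizes), round(agreement, 3))
```

On real data, the held-out test set would be kept separate from both the pseudo-labeling and the balancing steps, so that synthetic points never leak into validation.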