Inducing bias is simpler than you think

Machine learning may be oblivious to human bias, but it is not immune to its perpetuation. Marginalisation and iniquitous group representation are often traceable in the very data used for training, and may be reflected or even enhanced by the learning models. To counter this, some model accuracy can be traded off for a secondary objective that helps prevent a specific type of bias. Multiple notions of fairness have been proposed to this end, but recent studies show that some fairness criteria often stand in mutual competition. In the present work, we introduce a solvable high-dimensional model of data imbalance, where parametric control over the many bias-inducing factors allows for an extensive exploration of the bias inheritance mechanism. Using the tools of statistical physics, we analytically characterise the typical behaviour of learning models trained in our synthetic framework and find unfairness behaviours similar to those observed on more realistic data. However, we also identify a positive transfer effect between the different subpopulations within the data. This suggests that mixing data with different statistical properties could be helpful, provided the learning model is made aware of this structure. Finally, we analyse the issue of bias mitigation: by reweighing the various terms in the training loss, we indirectly minimise standard unfairness metrics and highlight their incompatibilities. Leveraging the insights on positive transfer, we also propose a theory-informed mitigation strategy based on the introduction of coupled learning models. By allowing each model to specialise on a different community within the data, we find that multiple fairness criteria and high accuracy can be achieved simultaneously.
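
To make the idea of reweighing the terms in the training loss concrete, here is a minimal sketch rather than the authors' implementation. It assumes a linear classifier, a logistic loss, labels in {-1, +1}, and one scalar weight per subpopulation. None of these choices are specified in the abstract, and the names `logistic_loss`, `reweighted_loss`, and `group_weights` are hypothetical.

```python
import math
from typing import Mapping, Sequence


def logistic_loss(margin: float) -> float:
    """Numerically stable log(1 + exp(-margin))."""
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def reweighted_loss(
    w: Sequence[float],
    xs: Sequence[Sequence[float]],
    ys: Sequence[int],
    groups: Sequence[int],
    group_weights: Mapping[int, float],
) -> float:
    """Average logistic loss where each sample's term is scaled by the
    weight assigned to its subpopulation (an illustrative assumption)."""
    total = 0.0
    for x, y, g in zip(xs, ys, groups):
        # Signed margin of a linear classifier; y is assumed to be -1 or +1.
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        total += group_weights[g] * logistic_loss(margin)
    return total / len(xs)


# Toy data: three samples from a majority group (0), one from a minority group (1).
xs = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [-1.0, 1.0]]
ys = [1, 1, -1, -1]
groups = [0, 0, 0, 1]

uniform = reweighted_loss([0.0, 0.0], xs, ys, groups, {0: 1.0, 1: 1.0})
upweighted = reweighted_loss([0.0, 0.0], xs, ys, groups, {0: 1.0, 1: 3.0})
```

With group weights of 1 and 3, the single minority sample contributes as much to the objective as the three majority samples combined, so a model minimising this loss is pushed to fit the minority group better, typically at some cost in overall accuracy.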
