The rise of algorithmic decision-making has spawned much research on fair machine learning (ML). Financial institutions use ML to build risk scorecards that support a range of credit-related decisions. Yet the literature on fair ML in credit scoring is scarce. This paper makes three contributions. First, we revisit statistical fairness criteria and examine their adequacy for credit scoring. Second, we catalog algorithmic options for incorporating fairness goals into the ML model development pipeline. Last, we empirically compare different fairness processors in a profit-oriented credit scoring context using real-world data. The empirical results substantiate the evaluation of fairness measures, identify suitable options for implementing fair credit scoring, and clarify the profit-fairness trade-off in lending decisions. We find that multiple fairness criteria can be approximately satisfied at once and recommend separation as a proper criterion for measuring the fairness of a scorecard. We also find that fair in-processors deliver a good balance between profit and fairness, and show that algorithmic discrimination can be reduced to a reasonable level at a relatively low cost. The code corresponding to the paper is available on GitHub.
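The abstract recommends separation as the fairness criterion for scorecards. Separation (also known as equalized odds) asks that error rates be equal across protected groups. As a rough illustration of how this can be quantified, the following sketch computes group-wise true and false positive rate gaps for a binary scorecard decision; the function name, variable names, toy data, and 0.5 cutoff are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def separation_gaps(y_true, y_pred, group):
    """Compute TPR and FPR differences between two protected groups.

    Separation (equalized odds) holds when both gaps are (close to) zero.
    y_true, y_pred: binary arrays of actual and predicted labels.
    group: binary array indicating protected-group membership.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = {}
    for g in (0, 1):
        mask = group == g
        tpr = np.mean(y_pred[mask & (y_true == 1)])  # true positive rate in group g
        fpr = np.mean(y_pred[mask & (y_true == 0)])  # false positive rate in group g
        rates[g] = (tpr, fpr)
    tpr_gap = abs(rates[0][0] - rates[1][0])
    fpr_gap = abs(rates[0][1] - rates[1][1])
    return tpr_gap, fpr_gap

# Illustrative usage with made-up scores and a 0.5 cutoff (not the paper's data).
y_true = np.array([1, 0, 1, 0, 1, 0, 1, 0])
scores = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(separation_gaps(y_true, (scores >= 0.5).astype(int), group))
```

In practice, the gaps would be computed at the cutoff implied by the lender's profit objective, which is what links the fairness measurement to the profit-fairness trade-off discussed in the paper.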