Ranking algorithms are widely employed in online hiring platforms including LinkedIn, TaskRabbit, and Fiverr. Since these platforms impact the livelihood of millions of people, it is important to ensure that the underlying algorithms do not adversely affect minority groups. However, prior research has demonstrated that the ranking algorithms employed by these platforms are prone to a variety of undesirable biases. To address this problem, fair ranking algorithms (e.g., Det-Greedy), which increase the exposure of underrepresented candidates, have been proposed in recent literature. However, there is little to no work that explores whether these fair ranking algorithms actually improve real-world outcomes (e.g., hiring decisions) for minority groups. Furthermore, there is no clear understanding of how other factors (e.g., job context, inherent biases of the employers) affect the real-world outcomes of minority groups. In this work, we study how gender biases manifest in online hiring platforms and how they impact real-world hiring decisions. More specifically, we analyze various sources of gender bias, including the nature of the ranking algorithm, the job context, and the inherent biases of employers, and establish how these factors interact and affect real-world hiring decisions. To this end, we experiment with three different ranking algorithms on three different job contexts using real-world data from TaskRabbit. We simulate the hiring scenarios on TaskRabbit by carrying out a large-scale user study with Amazon Mechanical Turk. We then leverage the responses from this study to understand the effect of each of the aforementioned factors. Our results demonstrate that fair ranking algorithms can be an effective tool for increasing the hiring of candidates from underrepresented genders, but that they induce inconsistent outcomes across candidate features and job contexts.