Context: Contemporary software development organizations lack diversity, and the ratios of women in free and open-source software (FOSS) communities are even lower than the industry average. Although the results of recent studies hint at the existence of biases against women, it is unclear to what extent such biases influence the outcomes of various software development tasks. Aim: We aim to identify whether the outcomes of, or participation in, code reviews (or pull requests) are influenced by the gender of a developer. Approach: With this goal, this study includes a total of 1010 FOSS projects. We developed six regression models for each of the 14 datasets (i.e., 10 Gerrit-based and four GitHub) to identify whether code acceptance, review intervals, and code review participation differ based on a developer's gender and gender-neutral profile. Key findings: We find significant gender biases during code acceptance among 13 out of the 14 datasets, with seven favoring men and the remaining six favoring women. We also found significant differences between men and women in terms of code review intervals, with women encountering longer delays in three cases and the opposite in seven. Our results indicate that reviewer selection is one of the most gender-biased aspects among most of the projects, with women having significantly lower code review participation in 11 out of the 14 cases. Since most review assignments are based on invitations, this result suggests possible affinity biases among the developers. Conclusion: Though gender bias exists among many projects, the direction and amplitude of bias vary based on project size, community, and culture. Similar bias mitigation strategies may not work across all communities, as the characteristics of biases and their underlying causes differ.