The presence of toxic and gender-identity derogatory language in open-source software (OSS) communities has recently become a focal point for researchers. Such comments not only lead to frustration and disengagement among developers but may also influence their leave from the OSS projects. Despite ample evidence suggesting that diverse teams enhance productivity, the existence of toxic or gender identity discriminatory communications poses a significant threat to the participation of individuals from marginalized groups and, as such, may act as a barrier to fostering diversity and inclusion in OSS projects. However, there is a notable lack of research dedicated to exploring the association between gender-based toxic and derogatory language with a perceptible diversity of open-source software teams. Consequently, this study aims to investigate how such content influences the gender, ethnicity, and tenure diversity of open-source software development teams. To achieve this, we extract data from active GitHub projects, assess various project characteristics, and identify instances of toxic and gender-discriminatory language within issue/pull request comments. Using these attributes, we construct a regression model to explore how they associate with the perceptible diversity of those projects.
翻译:暂无翻译