Multiple studies have demonstrated that behavior on internet-based social media platforms can be indicative of an individual's mental health status. The widespread availability of such data has spurred interest in mental health research through a computational lens. While previous research has raised concerns about possible biases in models produced from this data, no study has quantified how these biases actually manifest with respect to different demographic groups, such as gender and racial/ethnic groups. Here, we analyze the fairness of depression classifiers trained on Twitter data with respect to gender and racial demographic groups. We find that model performance systematically differs for underrepresented groups and that these discrepancies cannot be fully explained by trivial data representation issues. Our study concludes with recommendations on how to avoid these biases in future research.
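As an illustration of the kind of per-group fairness audit described above, the sketch below (not from the paper; the dataframe columns, decision threshold, and helper name are hypothetical assumptions) compares recall and ROC AUC across demographic groups for a binary depression classifier.

```python
# Illustrative sketch only: auditing a depression classifier's performance
# by demographic group. Column names and the 0.5 threshold are assumptions.
import pandas as pd
from sklearn.metrics import recall_score, roc_auc_score

def per_group_metrics(df, group_col, label_col="depressed", score_col="model_score"):
    """Compute group size, recall, and ROC AUC separately for each group."""
    rows = []
    for group, sub in df.groupby(group_col):
        preds = (sub[score_col] >= 0.5).astype(int)  # assumed decision threshold
        rows.append({
            group_col: group,
            "n": len(sub),
            "recall": recall_score(sub[label_col], preds),
            "auc": roc_auc_score(sub[label_col], sub[score_col]),
        })
    return pd.DataFrame(rows)

# Example usage (predictions_df is a hypothetical dataframe of held-out scores):
# metrics_by_gender = per_group_metrics(predictions_df, group_col="gender")
# metrics_by_race   = per_group_metrics(predictions_df, group_col="race")
# Large gaps in recall or AUC between groups would indicate the kind of
# systematic performance disparity the abstract describes.
```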