The Everyday Sexism Project documents everyday examples of sexism reported by volunteer contributors from around the world. It collected 100,000 entries in more than 13 languages within the first 3 years of its existence. The content of the reports submitted to Everyday Sexism in these languages is a valuable source of crowdsourced information with great potential for feminist and gender studies. In this paper, we take a computational approach to analyzing the content of the reports. We use topic modeling techniques to extract emerging topics and concepts from the reports, and to map the semantic relations between those topics. The resulting picture closely resembles, and adds to, the one arrived at through qualitative analysis, which suggests that this form of topic modeling could be useful for sifting through datasets that have not previously been analyzed. More precisely, we produce a map of topics at two different resolutions of our topic model and discuss the connections between the identified topics. In the low-resolution picture, for instance, we find the topics Public space/Street, Online, Work related/Office, Transport, School, Media harassment, and Domestic abuse. Among these, the strongest connection is between Public space/Street harassment on the one hand and Domestic abuse and sexism in personal relationships on the other. The strength of the relationships between topics illustrates the fluid and ubiquitous nature of sexism, with no single experience being unrelated to another.
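The abstract does not say how the strength of a connection between two topics is measured. As a minimal sketch of one possible reading, the Python snippet below assumes that a topic model has already assigned each report a vector of topic weights, and scores a pair of topics by the cosine similarity of their weight columns across reports. The weights, the helper names (`topic_connections`, `strongest_connection`), and the choice of cosine similarity are all illustrative assumptions, not the authors' method or data; only the topic labels come from the text.

```python
from itertools import combinations
from math import sqrt

# Topic labels borrowed from the low-resolution list in the abstract.
TOPICS = ["Public space/Street", "Online", "Work related/Office", "Domestic abuse"]

# Made-up report-by-topic weights (one row per report, one column per topic).
# These are toy numbers chosen for illustration, NOT data from the project.
DOC_TOPIC = [
    [0.7, 0.0, 0.0, 0.3],
    [0.6, 0.1, 0.0, 0.3],
    [0.0, 0.8, 0.2, 0.0],
    [0.1, 0.0, 0.9, 0.0],
    [0.4, 0.0, 0.0, 0.6],
]


def column(matrix, j):
    """Return column j of a row-major matrix as a list."""
    return [row[j] for row in matrix]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topic_connections(doc_topic, topics):
    """Score every unordered topic pair by the similarity of their weight columns.

    Assumption: two topics are 'connected' when they tend to carry weight
    in the same reports.
    """
    return {
        (topics[i], topics[j]): cosine(column(doc_topic, i), column(doc_topic, j))
        for i, j in combinations(range(len(topics)), 2)
    }


def strongest_connection(connections):
    """Return the topic pair with the highest connection score."""
    return max(connections, key=connections.get)


connections = topic_connections(DOC_TOPIC, TOPICS)
strongest = strongest_connection(connections)

for pair, score in sorted(connections.items(), key=lambda kv: -kv[1]):
    print(f"{pair[0]} <-> {pair[1]}: {score:.2f}")
```

With these toy weights the highest-scoring pair is Public space/Street and Domestic abuse, but only because the rows were constructed so that the two topics share reports; the result says nothing about the real dataset. In practice the weight matrix would come from whichever topic model is fitted to the reports.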