In this article, we investigate the role of gender in collaboration patterns by analyzing gender-based homophily, the tendency of researchers to co-author with individuals of the same gender. We develop novel methodology and apply it to the corpus of JSTOR articles, a broad scholarly landscape, which we analyze at various levels of granularity. Most notably, for a precise analysis of gender homophily, our methodology explicitly accounts for the fact that the data comprise heterogeneous intellectual communities and that not all authorships are exchangeable. In particular, we distinguish three phenomena that may affect the distribution of observed gender homophily in collaborations: a structural component due to the demographics and non-gendered authorship norms of a scholarly community; a compositional component driven by varying gender representation across sub-disciplines and time; and a behavioral component, which we define as the remainder of observed gender homophily after its structural and compositional components have been taken into account. The methodology relies on minimal modeling assumptions and allows us to test for behavioral homophily. We find that statistically significant behavioral homophily can be detected across the JSTOR corpus and show that this finding is robust to missing gender indicators in our data. In a secondary analysis, we show that the proportion of women in a field is positively associated with the probability of finding statistically significant behavioral homophily.
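To give a rough sense of what testing for homophily beyond gender composition can look like, the Python sketch below runs a generic stratified permutation test. It is not the methodology of this article. The function names, the data layout (a list of `(stratum, genders)` pairs, with the stratum standing in for a sub-discipline and time window), and the statistic (the share of same-gender co-author pairs) are all assumptions made for illustration. The sketch also treats authorships within a stratum as exchangeable, which is precisely the simplification the article argues against.

```python
import random
from collections import defaultdict
from itertools import combinations


def same_gender_pair_share(papers):
    """Share of co-author pairs with matching gender labels (illustrative statistic)."""
    same = total = 0
    for _, genders in papers:
        for a, b in combinations(genders, 2):
            total += 1
            same += (a == b)
    return same / total if total else 0.0


def shuffle_within_strata(papers, rng):
    """Reassign gender labels at random within each stratum.

    Paper sizes and per-stratum gender counts are held fixed, so the null
    distribution reflects authorship norms and composition only.
    Assumption: authorships inside a stratum are exchangeable.
    """
    pools = defaultdict(list)
    for stratum, genders in papers:
        pools[stratum].extend(genders)
    for pool in pools.values():
        rng.shuffle(pool)

    cursor = defaultdict(int)
    shuffled = []
    for stratum, genders in papers:
        start = cursor[stratum]
        end = start + len(genders)
        shuffled.append((stratum, pools[stratum][start:end]))
        cursor[stratum] = end
    return shuffled


def permutation_p_value(papers, n_perm=2000, seed=0):
    """One-sided permutation p-value for excess same-gender co-authorship."""
    rng = random.Random(seed)
    observed = same_gender_pair_share(papers)
    hits = sum(
        same_gender_pair_share(shuffle_within_strata(papers, rng)) >= observed
        for _ in range(n_perm)
    )
    # The +1 terms count the observed arrangement as one permutation.
    return observed, (1 + hits) / (1 + n_perm)


# Hypothetical toy data: two strata, gender labels "F" and "M".
toy_papers = [
    ("field_a", ["F", "F"]),
    ("field_a", ["M", "M"]),
    ("field_a", ["F", "M", "M"]),
    ("field_b", ["F", "F", "F"]),
    ("field_b", ["M", "F"]),
]
observed_share, p_value = permutation_p_value(toy_papers, n_perm=500)
```

Shuffling only within a stratum keeps differences in gender representation across strata from registering as homophily, while holding paper sizes fixed keeps team-size norms in the null distribution. What the statistic shows beyond that null is a crude stand-in for the behavioral component.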