As social issues related to gender bias attract closer scrutiny, accurate tools for determining the gender profile of large groups become essential. When explicit data is unavailable, gender is often inferred from names. Current methods assign each individual in the group, one by one, a gender label or probability based on the gender-name correlations observed in the population at large. We show that this strategy is logically inconsistent and has practical shortcomings, the most notable of which is the systematic underestimation of gender bias. We introduce a global inference strategy that estimates gender composition from the context of the full list of names. The resulting tool is free of intrinsic methodological effects, robust against errors, easy to implement, and computationally light.
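The contrast between the two strategies can be illustrated with a toy calculation. The sketch below is not the tool described in this abstract: the name statistics, the group counts, and the function names are all hypothetical, and the global step is assumed here to be a simple maximum-likelihood fit of the group's female fraction (a two-component mixture fitted by expectation-maximization). It only shows why averaging population-level name probabilities pulls the estimate toward the population's own gender balance.

```python
# Illustrative toy data only: none of these numbers come from the paper.
# P(name | gender) in a hypothetical population that is 50% female.
P_NAME_GIVEN_F = {"A": 0.6, "B": 0.3, "C": 0.1}
P_NAME_GIVEN_M = {"A": 0.1, "B": 0.3, "C": 0.6}
POPULATION_FEMALE_FRACTION = 0.5

# Name counts for a hypothetical group whose true female fraction is 0.2.
COUNTS = {"A": 20, "B": 30, "C": 50}


def posterior_female(name, prior):
    """P(female | name) by Bayes' rule for a given prior female fraction."""
    f = prior * P_NAME_GIVEN_F[name]
    m = (1.0 - prior) * P_NAME_GIVEN_M[name]
    return f / (f + m)


def per_name_estimate(counts):
    """One-by-one strategy: average the population-level P(female | name)."""
    total = sum(counts.values())
    return sum(n * posterior_female(name, POPULATION_FEMALE_FRACTION)
               for name, n in counts.items()) / total


def global_estimate(counts, iterations=200):
    """Assumed global strategy: fit the group's own female fraction by EM."""
    total = sum(counts.values())
    theta = 0.5  # starting guess; each pass re-weights names with the current fit
    for _ in range(iterations):
        theta = sum(n * posterior_female(name, theta)
                    for name, n in counts.items()) / total
    return theta


if __name__ == "__main__":
    print(f"per-name estimate: {per_name_estimate(COUNTS):.3f}")  # about 0.393
    print(f"global estimate:   {global_estimate(COUNTS):.3f}")    # about 0.200
```

In this toy setting the per-name average reports roughly 39% women for a group that is 20% women, understating the imbalance, while the fit over the whole list recovers the 20% used to construct the counts.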