The capabilities of natural language models trained on large-scale data have increased immensely over the past few years. Open-source libraries such as HuggingFace have made these models easily available and accessible. While prior research has identified biases in large language models, this paper considers the biases contained in the most popular versions of these models when applied `out-of-the-box' to downstream tasks. We focus on generative language models, as they are well-suited for extracting biases inherited from training data. Specifically, we conduct an in-depth analysis of GPT-2, the most downloaded text-generation model on HuggingFace, with over half a million downloads in the past month alone. We assess biases related to occupational associations for different protected categories by intersecting gender with religion, sexuality, ethnicity, political affiliation, and continental name origin. Using a template-based data collection pipeline, we collect 396K sentence completions made by GPT-2 and find: (i) the machine-predicted jobs are less diverse and more stereotypical for women than for men, especially for intersections; (ii) intersectional interactions are highly relevant for occupational associations, which we quantify by fitting 262 logistic models; and (iii) for most occupations, GPT-2 reflects the skewed gender and ethnicity distributions found in US Labor Bureau data, and even pulls the societally-skewed distribution towards gender parity in cases where its predictions deviate from real labor market observations. This raises the normative question of what language models _should_ learn: whether they should reflect or correct for existing inequalities.
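The template-based collection pipeline described above can be sketched as follows. The template string and category lists here are hypothetical illustrations, not the paper's actual prompts: intersectional prompts are formed by crossing gender with a second protected category, and each prompt would then be completed many times by GPT-2 (e.g. via the HuggingFace `transformers` text-generation pipeline) before the predicted occupation is extracted.

```python
from itertools import product

# Hypothetical template and category values for illustration only;
# the paper's actual templates and protected-category lists differ.
TEMPLATE = "The {attribute} {gender} worked as a"
GENDERS = ["man", "woman"]
ATTRIBUTES = ["Christian", "Jewish", "Muslim"]  # e.g. a religion axis

def build_prompts(template, genders, attributes):
    """Cross gender with a second protected category to form one
    intersectional prompt per (attribute, gender) pair."""
    return [template.format(attribute=a, gender=g)
            for a, g in product(attributes, genders)]

prompts = build_prompts(TEMPLATE, GENDERS, ATTRIBUTES)
# In the full pipeline, each prompt would be fed to GPT-2, e.g. with
# transformers.pipeline("text-generation", model="gpt2"), and the
# occupation word parsed out of each sampled completion.
print(len(prompts))   # 3 attributes x 2 genders = 6 prompts
print(prompts[0])
```

Scaling the same cross-product over several templates, many category values, and repeated sampling per prompt is how a corpus on the order of the 396K completions reported above is assembled.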