Measuring Gender Bias in Word Embeddings of Gendered Languages Requires Disentangling Grammatical Gender Signals

Does the grammatical gender of a language interfere when measuring the semantic gender information captured by its word embeddings? A number of anomalous gender bias measurements in the embeddings of gendered languages suggest this possibility. We demonstrate that word embeddings learn the association between a noun and its grammatical gender in grammatically gendered languages, which can skew social gender bias measurements. Consequently, word embedding post-processing methods are introduced to quantify, disentangle, and evaluate grammatical gender signals. The evaluation is performed on five gendered languages from the Germanic, Romance, and Slavic branches of the Indo-European language family. Our method reduces the strength of grammatical gender signals, which is measured in terms of effect size (Cohen's d), by a significant average of d = 1.3 for French, German, and Italian, and d = 0.56 for Polish and Spanish. Once grammatical gender is disentangled, the association between over 90% of 10,000 inanimate nouns and their assigned grammatical gender weakens, and cross-lingual bias results from the Word Embedding Association Test (WEAT) become more congruent with country-level implicit bias measurements. The results further suggest that disentangling grammatical gender signals from word embeddings may lead to improvement in semantic machine learning tasks.
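The abstract reports WEAT bias results and expresses signal strength as an effect size (Cohen's d). As a rough illustration only, the sketch below computes the WEAT effect size as it is commonly defined: the difference between the mean associations of two target sets with two attribute sets, divided by the standard deviation of the associations over all targets. The toy 2-D vectors, the function names, and the use of the sample standard deviation (`ddof=1`) are assumptions made for this example; they are not the authors' code, data, or disentangling method.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """Mean similarity of w to attribute set A minus mean similarity to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT effect size for target sets X, Y and attribute sets A, B."""
    s_X = np.array([association(x, A, B) for x in X])
    s_Y = np.array([association(y, A, B) for y in Y])
    pooled = np.concatenate([s_X, s_Y])
    # Assumption: sample standard deviation (ddof=1) over all target words.
    return float((s_X.mean() - s_Y.mean()) / pooled.std(ddof=1))

# Hypothetical toy embeddings (not real word vectors).
X = [np.array([1.0, 0.1]), np.array([1.0, -0.1])]    # target set 1
Y = [np.array([-1.0, 0.1]), np.array([-1.0, -0.1])]  # target set 2
A = [np.array([1.0, 0.0])]                           # attribute set 1
B = [np.array([-1.0, 0.0])]                          # attribute set 2

d = weat_effect_size(X, Y, A, B)
print(f"WEAT effect size d = {d:.3f}")
```

A positive d means the first target set is more strongly associated with the first attribute set than the second target set is; swapping the two target sets flips the sign.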


Related content

Word vector representations

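A distributed representation encodes language as dense, low-dimensional, continuous vectors. Researchers observed early on that learned word embeddings exhibit analogy relations, for example apple − apples ≈ car − cars and man − woman ≈ king − queen. These methods can be trained directly on large unlabeled corpora. The quality of word embeddings also depends heavily on the choice of context window size: larger context windows typically yield embeddings that reflect topical information, while smaller context windows yield embeddings that better reflect a word's function and contextual semantics.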
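The analogy relation man − woman ≈ king − queen can be made concrete with vector arithmetic: king − man + woman should land closest to queen. The sketch below uses hand-built 4-D toy vectors (the dimensions loosely stand for royalty, male, female, and fruit) and a hypothetical `nearest` helper; real embeddings are learned from corpora and satisfy such relations only approximately.

```python
import numpy as np

# Hand-built toy vectors, NOT learned embeddings.
# Dimensions (assumed for illustration): royalty, male, female, fruit.
vectors = {
    "man":   np.array([0.0, 1.0, 0.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0, 0.0]),
    "king":  np.array([1.0, 1.0, 0.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0, 0.0]),
    "apple": np.array([0.0, 0.0, 0.0, 1.0]),
}

def nearest(target, exclude):
    """Return the vocabulary word with the highest cosine similarity to target."""
    best_word, best_score = None, -np.inf
    for word, vec in vectors.items():
        if word in exclude:
            continue  # skip the words used to build the query
        score = np.dot(target, vec) / (np.linalg.norm(target) * np.linalg.norm(vec))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# king - man + woman: remove the "male" component, add the "female" one.
query = vectors["king"] - vectors["man"] + vectors["woman"]
answer = nearest(query, exclude={"king", "man", "woman"})
print(answer)
```

Excluding the query words from the search is a common convention in analogy evaluation, since the nearest neighbor of the query vector is otherwise often one of the input words.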
