While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent research has demonstrated that these models inherit societal biases present in natural language. In this paper, we explore a simple method to probe pre-trained language models for gender bias, which we use to carry out a multilingual study of gender bias towards politicians. We construct a dataset of 250k politicians from most countries in the world and quantify adjective and verb usage around those politicians' names as a function of their gender. We conduct our study in 7 languages across 6 different language modeling architectures. Our results demonstrate that stance towards politicians in pre-trained language models is highly dependent on the language used. Finally, contrary to previous findings, our study suggests that larger language models do not tend to be significantly more gender-biased than smaller ones.
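To make the probing idea concrete, below is a minimal sketch of one way to compare a masked language model's adjective predictions around politicians' names of different genders. This is an illustration under assumptions, not the paper's exact pipeline: the model name (`bert-base-multilingual-cased`), the "`<name> is [MASK].`" template, and the example names and adjective are all hypothetical choices.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Illustrative model choice; the paper studies 6 architectures in 7 languages.
MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def adjective_log_prob(name: str, adjective: str) -> float:
    """Log-probability the model assigns to `adjective` at the masked
    position in the template "<name> is [MASK].".

    Assumes the adjective is a single token in the model's vocabulary;
    multi-token adjectives would require multiple mask positions.
    """
    text = f"{name} is {tokenizer.mask_token}."
    inputs = tokenizer(text, return_tensors="pt")
    # Locate the [MASK] position in the tokenized input.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0][0]
    with torch.no_grad():
        logits = model(**inputs).logits
    log_probs = torch.log_softmax(logits[0, mask_pos], dim=-1)
    adj_id = tokenizer.convert_tokens_to_ids(adjective)
    if adj_id == tokenizer.unk_token_id:
        raise ValueError(f"{adjective!r} is not a single token in this vocabulary")
    return log_probs[adj_id].item()

# Hypothetical example: score the same adjective across two names.
for name in ["Angela Merkel", "Barack Obama"]:
    print(name, adjective_log_prob(name, "beautiful"))
```

Aggregating such scores over many names of each gender, and over many adjectives and verbs, would yield the kind of gender-conditioned usage statistics the abstract describes.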