Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection

May 9, 2022

Maarten Sap,Swabha Swayamdipta,Laura Vianna,Xuhui Zhou,Yejin Choi,Noah A. Smith

From arXiv; NAACL 2022 camera-ready version.

The perceived toxicity of language can vary based on someone's identity and beliefs, but this variation is often ignored when collecting toxic language datasets, resulting in dataset and model biases. We seek to understand the who, why, and what behind biases in toxicity annotations. In two online studies with demographically and politically diverse participants, we investigate the effect of annotator identities (who) and beliefs (why), drawing from social psychology research about hate speech, free speech, racist beliefs, political leaning, and more. We disentangle what is annotated as toxic by considering posts with three characteristics: anti-Black language, African American English (AAE) dialect, and vulgarity. Our results show strong associations between annotator identity and beliefs and their ratings of toxicity. Notably, more conservative annotators and those who scored highly on our scale for racist beliefs were less likely to rate anti-Black language as toxic, but more likely to rate AAE as toxic. We additionally present a case study illustrating how a popular toxicity detection system's ratings inherently reflect only specific beliefs and perspectives. Our findings call for contextualizing toxicity labels in social variables, which raises immense implications for toxic language annotation and detection.
