处理有毒言语检测中的偏见:调查 (Handling Bias in Toxic Speech Detection: A Survey) - 专知论文

会员服务 ·

0

有偏 · 偏置偏移 · 可约的 · Automator · Continuity ·

2022 年 2 月 2 日

Handling Bias in Toxic Speech Detection: A Survey

翻译：处理有毒言语检测中的偏见:调查

Tanmay Garg,Sarah Masud,Tharun Suresh,Tanmoy Chakraborty

from arxiv, 28 pages, 4 figures, 7 tables

The massive growth of social media usage has witnessed a tsunami of online toxicity in teams of hate speech, abusive posts, cyberbullying, etc. Detecting online toxicity is challenging due to its inherent subjectivity. Factors such as the context of the speech, geography, socio-political climate, and background of the producers and consumers of the posts play a crucial role in determining if the content can be flagged as toxic. Adoption of automated toxicity detection models in production can lead to a sidelining of the various demographic and psychographic groups they aim to help in the first place. It has piqued researchers' interest in examining unintended biases and their mitigation. Due to the nascent and multi-faceted nature of the work, complete literature is chaotic in its terminologies, techniques, and findings. In this paper, we put together a systematic study to discuss the limitations and challenges of existing methods. We start by developing a taxonomy for categorising various unintended biases and a suite of evaluation metrics proposed to quantify such biases. We take a closer look at each proposed method for evaluating and mitigating bias in toxic speech detection. To examine the limitations of existing methods, we also conduct a case study to introduce the concept of bias shift due to knowledge-based bias mitigation methods. The survey concludes with an overview of the critical challenges, research gaps and future directions. While reducing toxicity on online platforms continues to be an active area of research, a systematic study of various biases and their mitigation strategies will help the research community produce robust and fair models.

翻译：社交媒体使用量的大规模增长见证了仇恨言论、虐待性文章、网络欺凌等团队在线毒性的海啸,这些团队中出现了仇恨言论、滥用性文章、网络欺凌等等的在线毒性的海啸,由于其固有的主观性,检测在线毒性具有挑战性; 诸如言论、地理、社会政治气候等背景因素,以及这些文章的制作者和消费者的背景,在确定内容是否可标注为有毒方面发挥着关键作用; 生产过程中采用自动毒性检测模型,可以导致他们首先旨在帮助的各种人口和精神病群体被排挤在一边; 研究人员对研究意外偏向和减轻其偏向的兴趣感兴趣。由于工作的初创性和多面性,完整的文献在其术语、技术和结果方面混乱不清。在本文中,我们共同开展系统研究,讨论现有方法的局限性和挑战。我们首先开发一种分类,用于分析各种意外偏向的偏向,并提议一套评估指标来量化这种偏向。我们更密切地研究每一个拟议的评价和减轻有毒言论检测偏向性的方法。

0

相关内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Call for Nominations: 2022 Multimedia Prize Paper Award

Call for Nominations: 2022 Multimedia Prize Paper Award

CCF多媒体专委会

0+阅读 · 2022年2月12日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

基于大麻素2型受体适配体的靶向多模态光敏剂的基础研究

国家自然科学基金

0+阅读 · 2016年12月31日

社交网络级联数据流异常检测模型研究

国家自然科学基金

4+阅读 · 2015年12月31日

基于压缩感知的通信信号处理理论研究

国家自然科学基金

4+阅读 · 2015年12月31日

扬子鳄环境适应的MHC多样性

国家自然科学基金

0+阅读 · 2014年12月31日

高分辨率遥感影像地类特征的诊断能力分析与选择

国家自然科学基金

0+阅读 · 2013年12月31日

对称陀螺分子高温光谱的理论研究

国家自然科学基金

0+阅读 · 2012年12月31日

旋毛虫副肌球蛋白H-2d限制性Th表位的鉴定及Th-B表位肽嵌合疫苗免疫学效应研究

国家自然科学基金

0+阅读 · 2012年12月31日

神经网络子空间学习算法的收敛性与鲁棒性

国家自然科学基金

1+阅读 · 2009年12月31日

噪声导致的耳蜗毛细胞线粒体损伤及其介导毛细胞死亡的机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

仿生水下推进器的神经网络控制方法研究

国家自然科学基金

1+阅读 · 2008年12月31日

A Comprehensive Survey on Graph Anomaly Detection with Deep Learning

Arxiv

1+阅读 · 2022年4月20日

A Survey on Multi-hop Question Answering and Generation

Arxiv

0+阅读 · 2022年4月19日

Rumor Detection with Self-supervised Learning on Texts and Social Graph

Arxiv

0+阅读 · 2022年4月19日

Generalized Out-of-Distribution Detection: A Survey

Generalized Out-of-Distribution Detection: A Survey

Arxiv

15+阅读 · 2021年10月21日

Towards Open World Object Detection

Arxiv

13+阅读 · 2021年3月3日

Image/Video Deep Anomaly Detection: A Survey

Arxiv

16+阅读 · 2021年3月2日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Anomalous Instance Detection in Deep Learning: A Survey

Anomalous Instance Detection in Deep Learning: A Survey

Arxiv

29+阅读 · 2020年3月16日

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Arxiv

80+阅读 · 2020年1月19日

Object Detection in 20 Years: A Survey

Object Detection in 20 Years: A Survey

Arxiv

48+阅读 · 2019年5月13日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Call for Nominations: 2022 Multimedia Prize Paper Award

Call for Nominations: 2022 Multimedia Prize Paper Award

CCF多媒体专委会

0+阅读 · 2022年2月12日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

A Comprehensive Survey on Graph Anomaly Detection with Deep Learning

Arxiv

1+阅读 · 2022年4月20日

A Survey on Multi-hop Question Answering and Generation

Arxiv

0+阅读 · 2022年4月19日

Rumor Detection with Self-supervised Learning on Texts and Social Graph

Arxiv

0+阅读 · 2022年4月19日

Generalized Out-of-Distribution Detection: A Survey

Generalized Out-of-Distribution Detection: A Survey

Arxiv

15+阅读 · 2021年10月21日

Towards Open World Object Detection

Arxiv

13+阅读 · 2021年3月3日

Image/Video Deep Anomaly Detection: A Survey

Arxiv

16+阅读 · 2021年3月2日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Anomalous Instance Detection in Deep Learning: A Survey

Anomalous Instance Detection in Deep Learning: A Survey

Arxiv

29+阅读 · 2020年3月16日

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Arxiv

80+阅读 · 2020年1月19日

Object Detection in 20 Years: A Survey

Object Detection in 20 Years: A Survey

Arxiv

48+阅读 · 2019年5月13日

相关基金

基于大麻素2型受体适配体的靶向多模态光敏剂的基础研究

国家自然科学基金

0+阅读 · 2016年12月31日

社交网络级联数据流异常检测模型研究

国家自然科学基金

4+阅读 · 2015年12月31日

基于压缩感知的通信信号处理理论研究

国家自然科学基金

4+阅读 · 2015年12月31日

扬子鳄环境适应的MHC多样性

国家自然科学基金

0+阅读 · 2014年12月31日

高分辨率遥感影像地类特征的诊断能力分析与选择

国家自然科学基金

0+阅读 · 2013年12月31日

对称陀螺分子高温光谱的理论研究

国家自然科学基金

0+阅读 · 2012年12月31日

旋毛虫副肌球蛋白H-2d限制性Th表位的鉴定及Th-B表位肽嵌合疫苗免疫学效应研究

国家自然科学基金

0+阅读 · 2012年12月31日

神经网络子空间学习算法的收敛性与鲁棒性

国家自然科学基金

1+阅读 · 2009年12月31日

噪声导致的耳蜗毛细胞线粒体损伤及其介导毛细胞死亡的机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

仿生水下推进器的神经网络控制方法研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员