Automatic detection of toxic language plays an essential role in protecting social media users, especially minority groups, from verbal abuse. However, biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection. These biases make the learned models unfair and can even exacerbate the marginalization of people. Considering that current debiasing methods for general natural language understanding tasks cannot effectively mitigate the biases in toxicity detectors, we propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns (e.g., identity mentions, dialect) with toxicity labels. We empirically show that our method yields lower false positive rates on both lexical and dialectal attributes than previous debiasing methods.
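The sketch below illustrates the InvRat game structure referenced above: a generator produces a token-level rationale mask, an environment-agnostic predictor classifies toxicity from the masked text alone, and an environment-aware predictor additionally sees the protected attribute (here treated as the "environment", e.g., dialect). The generator is penalized whenever the environment helps prediction, discouraging rationales that rely on spurious attribute cues. This is a minimal, hedged PyTorch sketch, not the authors' implementation; all module names (`Generator`, `Predictor`, `invrat_losses`), sizes, and the relaxation used for the mask are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch of the InvRat three-player game (Chang et al., 2020),
# adapted to toxicity labels with a protected attribute as the environment.
# Module sizes and names are assumptions, not the paper's exact architecture.

class Generator(nn.Module):
    """Produces a soft token mask (the rationale) for each input sentence."""
    def __init__(self, vocab, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * dim, 1)

    def forward(self, tokens, temperature=0.5):
        h, _ = self.rnn(self.emb(tokens))
        logits = self.out(h).squeeze(-1)                      # (B, T) token scores
        u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)     # binary-concrete noise
        return torch.sigmoid((logits + u.log() - (1 - u).log()) / temperature)

class Predictor(nn.Module):
    """Classifies toxicity from the masked text; optionally sees the environment."""
    def __init__(self, vocab, n_env=0, dim=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.env_emb = nn.Embedding(n_env, dim) if n_env else None
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.cls = nn.Linear(dim, n_classes)

    def forward(self, tokens, mask, env=None):
        x = self.emb(tokens) * mask.unsqueeze(-1)             # keep only the rationale
        if self.env_emb is not None:
            x = x + self.env_emb(env).unsqueeze(1)            # inject environment info
        _, h = self.rnn(x)
        return self.cls(h.squeeze(0))

def invrat_losses(gen, pred_i, pred_e, tokens, labels, env, lam=1.0):
    """One forward pass of the game; returns generator and predictor losses."""
    mask = gen(tokens)
    loss_i = F.cross_entropy(pred_i(tokens, mask), labels)        # env-agnostic
    loss_e = F.cross_entropy(pred_e(tokens, mask, env), labels)   # env-aware
    # Penalize the generator whenever knowing the environment helps, so the
    # retained rationale carries only attribute-invariant evidence of toxicity.
    gen_loss = loss_i + lam * F.relu(loss_i - loss_e)
    return gen_loss, loss_i, loss_e
```

In training, the two predictors each minimize their own cross-entropy while the generator minimizes `gen_loss`; a sparsity/continuity regularizer on the mask (omitted here) is typically added to keep rationales short and contiguous.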