改进反事实生成,以发现公平的仇恨言论 (Improving Counterfactual Generation for Fair Hate Speech Detection) - 专知论文

会员服务 ·

0

Facebook AI Research · MoDELS · 可约的 · 对数几率 · Performer ·

2021 年 8 月 3 日

Improving Counterfactual Generation for Fair Hate Speech Detection

翻译：改进反事实生成,以发现公平的仇恨言论

Aida Mostafazadeh Davani,Ali Omrani,Brendan Kennedy,Mohammad Atari,Xiang Ren,Morteza Dehghani

from arxiv, 4-page paper Workshop on Online Abuse and Harms (ACL 2021)

Bias mitigation approaches reduce models' dependence on sensitive features of data, such as social group tokens (SGTs), resulting in equal predictions across the sensitive features. In hate speech detection, however, equalizing model predictions may ignore important differences among targeted social groups, as hate speech can contain stereotypical language specific to each SGT. Here, to take the specific language about each SGT into account, we rely on counterfactual fairness and equalize predictions among counterfactuals, generated by changing the SGTs. Our method evaluates the similarity in sentence likelihoods (via pre-trained language models) among counterfactuals, to treat SGTs equally only within interchangeable contexts. By applying logit pairing to equalize outcomes on the restricted set of counterfactuals for each instance, we improve fairness metrics while preserving model performance on hate speech detection.

翻译：减轻偏见的办法减少了模式对敏感数据特征的依赖,如社会团体标牌(SGTs),从而导致敏感特征之间的预测平等。然而,在发现仇恨言论时,平衡模型预测可能忽略目标社会群体之间的重大差异,因为仇恨言论可能包含针对每个社会团体的陈规定型语言。这里,为了考虑到每个社会团体的具体语言,我们依靠的是反事实公平,并平衡对反事实的预测,而改变社会团体标牌(SGTs)则产生反事实的预测。我们的方法评估了反事实(通过经过事先训练的语言模型)的判刑可能性的相似性,只有可互换的环境才能同等对待特殊社会群体。我们通过对日志配对,在每种情况下限制的一套反事实上实现结果的平等性,我们改进了公平衡量标准,同时保留了仇恨言论检测的示范性表现。

0

相关内容

Facebook AI Research

Facebook AI Research

Facebook AI Research

【AAAI2021】通过知识到文本转换来测试知识增强的常识性问题回答

【AAAI2021】通过知识到文本转换来测试知识增强的常识性问题回答

专知会员服务

29+阅读 · 2021年1月17日

【CVPR2020】视频符号语言识别中跨领域知识的传递, Transferring Cross-domain Knowledge for Video Sign Language Recognition

【CVPR2020】视频符号语言识别中跨领域知识的传递, Transferring Cross-domain Knowledge for Video Sign Language Recognition

专知会员服务

9+阅读 · 2020年4月17日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【斯坦福大学】领域自适应小样本生成（DAWSON: A Domain Adaptive Few Shot Generation Framework）

【斯坦福大学】领域自适应小样本生成（DAWSON: A Domain Adaptive Few Shot Generation Framework）

专知会员服务

36+阅读 · 2020年1月7日

【ICCV2019教程】物体检测的R-CNN通用框架，The Generalized R-CNN Framework for Object Detection，180页ppt，Facebook 人工智能研究院Ross Girshick大神

【ICCV2019教程】物体检测的R-CNN通用框架，The Generalized R-CNN Framework for Object Detection，180页ppt，Facebook 人工智能研究院Ross Girshick大神

专知会员服务

25+阅读 · 2019年11月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

已删除

将门创投

5+阅读 · 2019年4月29日

【论文推荐】最新五篇视频分类相关论文—细粒度行人识别、群组归一化、MLtuner、时序特征

【论文推荐】最新五篇视频分类相关论文—细粒度行人识别、群组归一化、MLtuner、时序特征

专知

22+阅读 · 2018年4月21日

A Legal Approach to Hate Speech -- Operationalizing the EU's Legal Framework against the Expression of Hatred as an NLP Task

A Legal Approach to Hate Speech -- Operationalizing the EU's Legal Framework against the Expression of Hatred as an NLP Task

Arxiv

0+阅读 · 2021年10月5日

Federating for Learning Group Fair Models

Arxiv

0+阅读 · 2021年10月5日

xFAIR: Better Fairness via Model-based Rebalancing of Protected Attributes

Arxiv

0+阅读 · 2021年10月3日

Adversarial Examples Generation for Reducing Implicit Gender Bias in Pre-trained Models

Arxiv

0+阅读 · 2021年10月3日

Enhancing Model Robustness and Fairness with Causality: A Regularization Approach

Arxiv

0+阅读 · 2021年10月3日

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Arxiv

3+阅读 · 2021年1月29日

Improved Speech Enhancement with the Wave-U-Net

Arxiv

8+阅读 · 2018年11月27日

Neural source-filter-based waveform model for statistical parametric speech synthesis

Arxiv

4+阅读 · 2018年11月26日

An Improved Evaluation Framework for Generative Adversarial Networks

Arxiv

3+阅读 · 2018年3月27日

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Arxiv

18+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

Facebook AI Research

相关VIP内容

【AAAI2021】通过知识到文本转换来测试知识增强的常识性问题回答

【AAAI2021】通过知识到文本转换来测试知识增强的常识性问题回答

专知会员服务

29+阅读 · 2021年1月17日

【CVPR2020】视频符号语言识别中跨领域知识的传递, Transferring Cross-domain Knowledge for Video Sign Language Recognition

【CVPR2020】视频符号语言识别中跨领域知识的传递, Transferring Cross-domain Knowledge for Video Sign Language Recognition

专知会员服务

9+阅读 · 2020年4月17日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【斯坦福大学】领域自适应小样本生成（DAWSON: A Domain Adaptive Few Shot Generation Framework）

【斯坦福大学】领域自适应小样本生成（DAWSON: A Domain Adaptive Few Shot Generation Framework）

专知会员服务

36+阅读 · 2020年1月7日

【ICCV2019教程】物体检测的R-CNN通用框架，The Generalized R-CNN Framework for Object Detection，180页ppt，Facebook 人工智能研究院Ross Girshick大神

【ICCV2019教程】物体检测的R-CNN通用框架，The Generalized R-CNN Framework for Object Detection，180页ppt，Facebook 人工智能研究院Ross Girshick大神

专知会员服务

25+阅读 · 2019年11月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能治理的未来

模态感知的特征匹配：单一模态与跨模态技术的全面综述

无监督行人重识别研究综述

【牛津博士论文】面向神经影像应用的可扩展且可解释的空间模型

相关资讯

已删除

将门创投

5+阅读 · 2019年4月29日

【论文推荐】最新五篇视频分类相关论文—细粒度行人识别、群组归一化、MLtuner、时序特征

【论文推荐】最新五篇视频分类相关论文—细粒度行人识别、群组归一化、MLtuner、时序特征

专知

22+阅读 · 2018年4月21日

相关论文

A Legal Approach to Hate Speech -- Operationalizing the EU's Legal Framework against the Expression of Hatred as an NLP Task

A Legal Approach to Hate Speech -- Operationalizing the EU's Legal Framework against the Expression of Hatred as an NLP Task

Arxiv

0+阅读 · 2021年10月5日

Federating for Learning Group Fair Models

Arxiv

0+阅读 · 2021年10月5日

xFAIR: Better Fairness via Model-based Rebalancing of Protected Attributes

Arxiv

0+阅读 · 2021年10月3日

Adversarial Examples Generation for Reducing Implicit Gender Bias in Pre-trained Models

Arxiv

0+阅读 · 2021年10月3日

Enhancing Model Robustness and Fairness with Causality: A Regularization Approach

Arxiv

0+阅读 · 2021年10月3日

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Arxiv

3+阅读 · 2021年1月29日

Improved Speech Enhancement with the Wave-U-Net

Arxiv

8+阅读 · 2018年11月27日

Neural source-filter-based waveform model for statistical parametric speech synthesis

Arxiv

4+阅读 · 2018年11月26日

An Improved Evaluation Framework for Generative Adversarial Networks

Arxiv

3+阅读 · 2018年3月27日

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Arxiv

18+阅读 · 2018年1月5日

微信扫码咨询专知VIP会员