Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection

Counterfactually Augmented Data (CAD) aims to improve out-of-domain generalizability, an indicator of model robustness. The improvement is credited to CAD promoting core features of the construct over spurious artifacts that happen to correlate with it. Yet over-relying on core features may lead to unintended model bias. In particular, construct-driven CAD (perturbations of core features) may induce models to ignore the context in which core features are used. Here, we test models for sexism and hate speech detection on challenging data: non-hateful and non-sexist usage of identity and gendered terms. In these hard cases, models trained on CAD, especially construct-driven CAD, show higher false-positive rates than models trained on the original, unperturbed data. Using a diverse set of CAD, both construct-driven and construct-agnostic, reduces such unintended bias.
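The comparison in the abstract rests on the false-positive rate over hard cases, that is, non-hateful texts that still contain identity or gendered terms. The following is a minimal sketch of that metric, not the authors' evaluation code. The function name `false_positive_rate` and all labels and predictions are hypothetical toy values chosen for illustration; they are not results from the paper.

```python
from typing import Sequence


def false_positive_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return FP / (FP + TN), computed over examples whose true label is 0."""
    # Keep only the predictions for truly negative (non-hateful) examples.
    negative_preds = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not negative_preds:
        raise ValueError("no negative examples to evaluate")
    false_positives = sum(1 for p in negative_preds if p == 1)
    return false_positives / len(negative_preds)


# Hypothetical hard-case test set: every example is non-hateful (label 0)
# even though it mentions an identity or gendered term.
hard_case_labels = [0, 0, 0, 0]

# Made-up predictions from two imaginary models (1 = flagged as hateful).
preds_original_model = [0, 0, 1, 0]  # trained on unperturbed data
preds_cad_model = [1, 0, 1, 0]       # trained on construct-driven CAD

fpr_original = false_positive_rate(hard_case_labels, preds_original_model)
fpr_cad = false_positive_rate(hard_case_labels, preds_cad_model)
print(f"original: {fpr_original:.2f}, CAD: {fpr_cad:.2f}")
```

A higher value for the CAD-trained model on such a set would correspond to the unintended bias the abstract describes: the model reacts to the identity term itself rather than to how it is used.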

