As natural language processing (NLP) is deployed more widely in daily life, the social biases inherited by NLP models have become more severe and problematic. Previous studies have shown that word embeddings trained on human-generated corpora carry strong gender biases that can produce discriminatory results in downstream tasks. Previous debiasing methods focus mainly on modeling bias and consider semantic information only implicitly, while completely overlooking the complex underlying causal structure among bias and semantic components. To address these issues, we propose a novel methodology that leverages a causal inference framework to effectively remove gender bias. The proposed method allows us to construct and analyze the complex causal mechanisms facilitating gender information flow while retaining oracle semantic information within word embeddings. Our comprehensive experiments show that the proposed method achieves state-of-the-art results in gender-debiasing tasks. In addition, our method yields better performance in word similarity evaluation and various extrinsic downstream NLP tasks.