Compared to the numerous debiasing methods proposed for static, non-contextualised word embeddings, the discriminative biases in contextualised embeddings have received relatively little attention. We propose a fine-tuning method that can be applied at the token or sentence level to debias pre-trained contextualised embeddings. Our proposed method can be applied to any pre-trained contextualised embedding model, without requiring those models to be retrained. Using gender bias as an illustrative example, we conduct a systematic study with several state-of-the-art (SoTA) contextualised representations on multiple benchmark datasets to evaluate the level of bias encoded in different contextualised embeddings before and after debiasing with the proposed method. We find that applying token-level debiasing to all tokens and across all layers of a contextualised embedding model produces the best performance. Interestingly, we observe a trade-off between creating an accurate versus an unbiased contextualised embedding model, and different contextualised embedding models respond differently to this trade-off.
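To make the fine-tuning setup concrete, the sketch below shows one plausible form a token-level debiasing objective could take: every token embedding is pushed to be orthogonal to an estimated gender direction, while a regulariser keeps the embeddings close to those of the original, frozen model. This is a minimal illustrative sketch, not the specific loss used in the paper; the model name, the attribute word pairs, and the weight `lam` are assumptions for the example.

```python
# Hypothetical token-level debiasing fine-tuning sketch (illustrative only).
import copy
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"                  # any pre-trained contextualised encoder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)     # fine-tuned (debiased) copy
frozen = copy.deepcopy(model).eval()              # original model, kept fixed
for p in frozen.parameters():
    p.requires_grad = False

def gender_direction(pairs):
    """Average difference vector over attribute word pairs such as ("he", "she")."""
    diffs = []
    for male, female in pairs:
        ids_m = tokenizer(male, return_tensors="pt")
        ids_f = tokenizer(female, return_tensors="pt")
        with torch.no_grad():
            v_m = frozen(**ids_m).last_hidden_state[0, 1]  # first word-piece after [CLS]
            v_f = frozen(**ids_f).last_hidden_state[0, 1]
        diffs.append(v_m - v_f)
    d = torch.stack(diffs).mean(dim=0)
    return d / d.norm()

g = gender_direction([("he", "she"), ("man", "woman"), ("father", "mother")])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
lam = 1.0   # weight of the "stay close to the original embeddings" regulariser

def debias_step(sentences):
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    h = model(**batch).last_hidden_state             # token-level embeddings being debiased
    with torch.no_grad():
        h_orig = frozen(**batch).last_hidden_state   # reference embeddings from the frozen model
    mask = batch["attention_mask"].unsqueeze(-1)
    # Debiasing term: make every token embedding orthogonal to the gender direction g.
    bias_loss = ((h @ g) ** 2 * mask.squeeze(-1)).sum() / mask.sum()
    # Regulariser: preserve the semantic content of the original embeddings.
    keep_loss = (((h - h_orig) ** 2) * mask).sum() / mask.sum()
    loss = bias_loss + lam * keep_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example fine-tuning step on a small batch of sentences.
print(debias_step(["The doctor finished her shift.", "The nurse updated his notes."]))
```

In this sketch only the debiased copy of the model is updated; applying the orthogonality term to all tokens in every sentence corresponds to the "all tokens, all layers" configuration reported as best in the abstract, although here only the final layer is shown for brevity.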