We study the relationship between task-agnostic intrinsic and task-specific extrinsic social bias evaluation measures for Masked Language Models (MLMs) and find only a weak correlation between the two types of measures. Moreover, we find that MLMs debiased using different methods still re-learn social biases during fine-tuning on downstream tasks. We identify social biases in both the training instances and their assigned labels as reasons for the discrepancy between intrinsic and extrinsic bias evaluation measures. Overall, our findings highlight the limitations of existing MLM bias evaluation measures and raise concerns about deploying MLMs in downstream applications on the basis of those measures.
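As an illustration of the kind of comparison the abstract describes, the following minimal sketch computes a correlation coefficient between an intrinsic and an extrinsic bias score measured on the same set of models. It assumes Pearson's r as the statistic, which the abstract does not specify. The function name `pearson_r` and the score lists are hypothetical placeholders and are not values from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder scores, one pair per model (NOT results from the paper).
toy_intrinsic = [0.52, 0.61, 0.48, 0.57, 0.66]  # e.g. an intrinsic bias score
toy_extrinsic = [0.10, 0.07, 0.12, 0.11, 0.08]  # e.g. a downstream bias score

r = pearson_r(toy_intrinsic, toy_extrinsic)
print(f"Pearson r between intrinsic and extrinsic scores: {r:.3f}")
```

A value of r near zero would indicate that ranking models by the intrinsic measure says little about their ranking under the extrinsic measure, which is the sense of "weak correlation" used above.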