Large pre-trained language models are successfully used in a variety of tasks across many languages. With this ever-increasing usage, the risk of harmful side effects also rises, for example by reproducing and reinforcing stereotypes. However, detecting and mitigating these harms is difficult in general and becomes computationally expensive when tackling multiple languages or when considering different biases. To address this, we present FairDistillation: a cross-lingual method based on knowledge distillation to construct smaller language models while controlling for specific biases. We find that our distillation method does not negatively affect downstream performance on most tasks and successfully mitigates stereotyping and representational harms. We demonstrate that FairDistillation can create fairer language models at a considerably lower cost than alternative approaches.
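The abstract does not specify the training objective, so purely as a minimal sketch of how knowledge distillation could control for a specific bias, the PyTorch snippet below equalises the teacher's probability mass over protected word pairs (e.g. "he"/"she") before using it as the soft distillation target. The function name `fair_distillation_loss`, the `word_pairs` argument, and the equalisation rule are illustrative assumptions, not the paper's actual API or method.

```python
import torch
import torch.nn.functional as F

def fair_distillation_loss(student_logits, teacher_logits, word_pairs, temperature=2.0):
    """Hypothetical distillation loss with bias control (illustrative only).

    student_logits: (batch, vocab) student predictions at a masked position.
    teacher_logits: (batch, vocab) teacher predictions at the same position.
    word_pairs:     (id_a, id_b) vocabulary index pairs, e.g. for "he"/"she".
    """
    # Teacher targets are fixed; no gradients should flow through them.
    teacher_probs = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    # Assumed bias-control step: average the teacher's probability mass over
    # each protected pair so the student cannot inherit a preference for
    # either token. Total probability mass is conserved by the averaging.
    for id_a, id_b in word_pairs:
        mean = (teacher_probs[:, id_a] + teacher_probs[:, id_b]) / 2
        teacher_probs[:, id_a] = mean
        teacher_probs[:, id_b] = mean
    # Standard soft-target KL loss, scaled by T^2 (Hinton et al., 2015).
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2

# Usage example: batch of 8, vocabulary of 100, protected pair at ids 5 and 6.
s = torch.randn(8, 100, requires_grad=True)
t = torch.randn(8, 100)
loss = fair_distillation_loss(s, t, word_pairs=[(5, 6)])
loss.backward()
```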