Models trained on real-world data tend to imitate and amplify social biases. Although many methods have been proposed to mitigate biases, they require prior information about the types of biases to be mitigated (e.g., gender or racial bias) and the social group associated with each data sample. In this work, we propose a debiasing method that operates without any prior knowledge of the demographics in the dataset: it detects biased examples with an auxiliary model that predicts the main model's success, and down-weights them during training. Results on racial and gender bias demonstrate that it is possible to mitigate social biases without a costly demographic annotation process.
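To make the idea concrete, the following is a minimal PyTorch sketch of one joint training step, under assumed simplifications not taken from the paper: both the main task classifier and the auxiliary success detector are plain linear models, and down-weighting uses a focal-style factor (1 - p_success)^gamma on the per-example loss. The names (success_detector, GAMMA, training_step) are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions and hyperparameters for illustration only.
INPUT_DIM, NUM_CLASSES, GAMMA = 128, 2, 2.0

main_model = nn.Linear(INPUT_DIM, NUM_CLASSES)   # task classifier
success_detector = nn.Linear(INPUT_DIM, 2)       # auxiliary model: will the main model succeed?

main_opt = torch.optim.Adam(main_model.parameters(), lr=1e-3)
aux_opt = torch.optim.Adam(success_detector.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss(reduction="none")

def training_step(x, y):
    """One joint step: the auxiliary model learns to predict the main model's
    success; examples whose success it predicts confidently (likely solvable
    via biased shortcuts) are down-weighted in the main model's loss."""
    logits = main_model(x)

    # Train the auxiliary model to predict the main model's current success/failure.
    success = (logits.argmax(dim=-1) == y).long().detach()
    aux_logits = success_detector(x)
    aux_loss = ce(aux_logits, success).mean()
    aux_opt.zero_grad()
    aux_loss.backward()
    aux_opt.step()

    # Down-weight examples with high predicted success probability.
    with torch.no_grad():
        p_success = aux_logits.softmax(dim=-1)[:, 1]
    weights = (1.0 - p_success) ** GAMMA
    main_loss = (weights * ce(logits, y)).mean()
    main_opt.zero_grad()
    main_loss.backward()
    main_opt.step()
    return main_loss.item()

# Toy usage with random data.
x = torch.randn(32, INPUT_DIM)
y = torch.randint(0, NUM_CLASSES, (32,))
print(training_step(x, y))
```

Crucially, the weights are computed from the auxiliary model alone, so no demographic labels or bias-type annotations enter the training signal.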