The simple observation that not all things are equally difficult has surprising implications in a fairness context. In this work we explore how "difficulty" is model-specific, such that different models find different parts of a dataset challenging. When difficulty correlates with group information, we term this a difficulty disparity. Drawing a connection to recent work on the inductive bias towards simplicity in SGD-trained models, we show that when such a disparity exists, it is further amplified by commonly used models. We quantify this amplification factor across a range of settings, aiming towards a fuller understanding of the role of model bias. We also challenge the simplifying assumption that "fixing" a dataset is sufficient to ensure unbiased performance.