Mitigating bias when training on biased datasets is an important open problem. Several techniques have been proposed, but the typical evaluation regime is very limited, considering only narrow data conditions. For instance, the effects of target class imbalance and stereotyping are under-studied. To address this gap, we examine the performance of various debiasing methods across multiple tasks, spanning binary classification (Twitter sentiment), multi-class classification (profession prediction), and regression (valence prediction). Through extensive experimentation, we find that data conditions strongly influence relative model performance, and that general conclusions about method efficacy cannot be drawn from evaluation on standard datasets alone, as is current practice in fairness research.