Class imbalance is a common challenge in many NLP tasks and is closely connected to bias: biased training data often leads to higher accuracy for majority groups at the expense of minority groups. However, there has traditionally been a disconnect between research on class-imbalanced learning and research on mitigating bias, and only recently have the two been examined through a common lens. In this work, we evaluate long-tail learning methods for tweet sentiment and occupation classification, and extend a margin-loss-based approach with methods to enforce fairness. Through controlled experiments, we show empirically that the proposed approaches help mitigate both class imbalance and demographic biases.