In this paper, we propose a general framework for mitigating disparities in predicted classes with respect to secondary attributes of the data (e.g., race or gender). The proposed method learns a multi-objective function: in addition to the primary objective of predicting the primary class labels from the data, it employs a clustering-based heuristic to minimize disparities in the class label distribution across cluster memberships, under the assumption that each cluster should ideally map to a distinct combination of attribute values. Experiments on a benchmark dataset demonstrate effective mitigation of cognitive biases both without any annotations of secondary attribute values (the zero-shot case) and with a small number of such annotations (the few-shot case).
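To make the multi-objective idea concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes that cluster assignments have already been computed by some clustering step, that disparity is measured as the L1 distance between each cluster's mean predicted class distribution and the overall mean, and that the two terms are combined with a weight `lam`. The function names, the distance measure, and `lam` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def cross_entropy(probs, labels):
    """Primary objective: mean negative log-likelihood of the true class labels."""
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


def mean_distribution(rows):
    """Column-wise mean of a list of per-example class probability vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def cluster_disparity(probs, cluster_ids):
    """Secondary objective (assumed form): average L1 gap between each
    cluster's mean predicted distribution and the overall mean."""
    overall = mean_distribution(probs)
    by_cluster = defaultdict(list)
    for p, c in zip(probs, cluster_ids):
        by_cluster[c].append(p)
    gaps = [
        sum(abs(a - b) for a, b in zip(mean_distribution(rows), overall))
        for rows in by_cluster.values()
    ]
    return sum(gaps) / len(gaps)


def combined_objective(probs, labels, cluster_ids, lam=1.0):
    """Weighted sum of the two terms; lam is a hypothetical trade-off weight."""
    return cross_entropy(probs, labels) + lam * cluster_disparity(probs, cluster_ids)


if __name__ == "__main__":
    # Toy data: predictions differ sharply between the two clusters.
    probs = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
    labels = [0, 0, 1, 1]
    cluster_ids = [0, 0, 1, 1]
    print("disparity:", cluster_disparity(probs, cluster_ids))
    print("objective:", combined_objective(probs, labels, cluster_ids, lam=0.5))
```

In this sketch the disparity term is zero when every cluster receives the same average predicted class distribution, so minimizing it pushes predictions toward independence from cluster membership. In the zero-shot setting the cluster ids would come from unsupervised clustering alone. How the few-shot annotations inform the clustering is not specified in the abstract and is left out here.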