Despite a surge of recent advances in promoting machine learning (ML) fairness, most existing mainstream approaches require retraining or fine-tuning all weights of the neural network to meet the fairness criteria. However, this is often infeasible in practice for large-scale trained models due to high computational and storage costs, low data efficiency, and model privacy concerns. In this paper, we propose a new generic fairness learning paradigm, called FairReprogram, which incorporates the model reprogramming technique. Specifically, FairReprogram considers the case where the model cannot be changed, and appends to the input a set of perturbations, called the fairness trigger, which is tuned toward the fairness criteria under a min-max formulation. We further introduce an information-theoretic framework that explains why and under what conditions fairness goals can be achieved using the fairness trigger. We show both theoretically and empirically that the fairness trigger can effectively obscure demographic biases in the output prediction of fixed ML models by providing false demographic information that hinders the model from utilizing the correct demographic information to make its prediction. Extensive experiments on both NLP and CV datasets demonstrate that our method achieves better fairness improvements than retraining-based methods with far less data dependency under two widely used fairness criteria. Code is available at https://github.com/UCSB-NLP-Chang/Fairness-Reprogramming.git.
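To make the idea concrete, the following is a minimal sketch (not the authors' implementation; see the repository above for that) of training a fairness trigger under a min-max objective while the pretrained classifier stays frozen. All names here (Classifier/Adversary architectures, trigger, lambda_fair, step) are illustrative assumptions, and the adversarial bias term is one common instantiation of the fairness loss, not necessarily the paper's exact objective.

import torch
import torch.nn as nn

d_in, d_out, n_groups = 32, 2, 2

# Pretrained task model: weights are kept fixed (the "model cannot be changed" setting).
classifier = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, d_out))
for p in classifier.parameters():
    p.requires_grad_(False)

# Adversary tries to recover the demographic attribute from the model's output.
adversary = nn.Sequential(nn.Linear(d_out, 16), nn.ReLU(), nn.Linear(16, n_groups))

# Additive fairness trigger appended to every input; this is the only tuned quantity
# on the task side.
trigger = torch.zeros(d_in, requires_grad=True)

opt_trig = torch.optim.Adam([trigger], lr=1e-2)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()
lambda_fair = 1.0  # trade-off between task utility and fairness

def step(x, y, group):
    # Max step: adversary learns to predict the demographic group from the logits.
    logits = classifier(x + trigger)
    adv_loss = ce(adversary(logits.detach()), group)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # Min step: trigger preserves task accuracy while hiding demographic information
    # (i.e., it maximizes the adversary's loss), with the classifier untouched.
    logits = classifier(x + trigger)
    trig_loss = ce(logits, y) - lambda_fair * ce(adversary(logits), group)
    opt_trig.zero_grad(); trig_loss.backward(); opt_trig.step()

# Toy usage with random data standing in for a real batch.
x = torch.randn(8, d_in)
y = torch.randint(0, d_out, (8,))
group = torch.randint(0, n_groups, (8,))
step(x, y, group)

Because only the trigger (and the small adversary) receives gradient updates, the storage and compute footprint is a tiny fraction of retraining the full network, which is the practical motivation stated in the abstract.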