With the machine learning community's growing interest in solving real-world problems, it has become crucial to uncover the reasoning hidden behind the decisions of black-box models by focusing on fairness and auditing their predictions. In this paper, we propose a novel method to address two key questions: (a) can we simultaneously learn fair disentangled representations while ensuring the utility of the learned representations for downstream tasks, and (b) can we provide theoretical insights into when the proposed approach will be both fair and accurate? To address the former, we propose FRIED, Fair Representation learning using Interpolation Enabled Disentanglement. In our architecture, a critic-based adversarial framework enforces that points interpolated in the latent space are realistic. This helps capture the data manifold effectively and enhances the utility of the learned representations for downstream prediction tasks. We address the latter question by developing a theory of fairness-accuracy trade-offs based on classifier-based conditional mutual information estimation. We demonstrate the effectiveness of FRIED on datasets of different modalities: tabular, text, and image. We observe that the representations learned by FRIED are fairer overall than those of existing baselines while remaining accurate for downstream prediction tasks. Additionally, we evaluate FRIED on a real-world healthcare claims dataset, where we conduct an expert-aided model auditing study that provides useful insights into opioid addiction patterns.
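The critic-based interpolation idea described above resembles adversarially constrained autoencoder interpolation (ACAI; Berthelot et al., 2018): a critic tries to recover the mixing coefficient from a decoded interpolant, while the autoencoder tries to make interpolants indistinguishable from real reconstructions. The following is a minimal PyTorch-style sketch of such a penalty under that assumption; it is not the authors' exact FRIED architecture, and the encoder, decoder, critic, and all shapes are illustrative placeholders.

```python
import torch
import torch.nn as nn

# Hypothetical modules; layer sizes are illustrative only.
encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))
critic = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 1))

def interpolation_losses(x1, x2):
    """ACAI-style adversarial interpolation penalty (an assumption, not
    necessarily the FRIED loss): the critic regresses the mixing
    coefficient alpha from a decoded interpolant, and the autoencoder is
    penalized when the critic can do so, pushing interpolated points to
    decode to realistic samples."""
    z1, z2 = encoder(x1), encoder(x2)
    # Restrict alpha to [0, 0.5] so the coefficient is identifiable.
    alpha = 0.5 * torch.rand(x1.size(0), 1)
    x_interp = decoder(alpha * z1 + (1 - alpha) * z2)
    alpha_hat = critic(x_interp)
    critic_loss = ((alpha_hat - alpha) ** 2).mean()  # critic: recover alpha
    ae_adv_loss = (alpha_hat ** 2).mean()            # AE: make alpha look like 0
    recon_loss = ((decoder(z1) - x1) ** 2).mean()    # standard reconstruction
    return recon_loss + ae_adv_loss, critic_loss
```

In training, the autoencoder and critic losses would be minimized in alternation, as in standard adversarial setups; a fairness objective on the sensitive attribute would be added on top of this reconstruction-plus-interpolation term.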