We study the problem of learning fair prediction models for unseen test sets distributed differently from the training set. Stability against changes in data distribution is an important mandate for responsible deployment of models. The domain adaptation literature addresses this concern, albeit with the notion of stability limited to that of prediction accuracy. We identify sufficient conditions under which stable models, in terms of both prediction accuracy and fairness, can be learned. Using the causal graph describing the data and the anticipated shifts, we specify an approach based on feature selection that exploits conditional independencies in the data to estimate accuracy and fairness metrics for the test set. We show that for specific fairness definitions, the resulting model satisfies a form of worst-case optimality. In the context of a healthcare task, we illustrate the advantages of the approach in making more equitable decisions.
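To make the feature-selection idea concrete, below is a minimal sketch (not the authors' implementation) of graph-based stable feature selection: given a causal DAG augmented with an auxiliary shift node S marking the mechanism expected to change between training and test, we keep only feature subsets Z that d-separate the label Y from S, so that P(Y | Z) is invariant under the anticipated shift. The example graph, node names, and the helper stable_feature_sets are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of conditional-independence-based feature selection over a
# causal graph. Assumes networkx with nx.d_separated (roughly 2.5-3.2; newer
# releases rename it to nx.is_d_separator).
from itertools import combinations

import networkx as nx


def stable_feature_sets(graph, label, shift_node, features):
    """Return all feature subsets Z such that label is d-separated from the
    shift node given Z, i.e. P(label | Z) is stable under the anticipated shift."""
    stable = []
    for r in range(len(features) + 1):
        for subset in combinations(features, r):
            z = set(subset)
            if nx.d_separated(graph, {label}, {shift_node}, z):
                stable.append(z)
    return stable


if __name__ == "__main__":
    # Hypothetical healthcare-style DAG: S shifts the marginal of A (e.g. the
    # demographic mix), A and X1 cause the outcome Y, and X2 is a symptom of Y.
    g = nx.DiGraph([("S", "A"), ("A", "Y"), ("X1", "Y"), ("Y", "X2")])
    candidates = ["A", "X1", "X2"]
    for z in stable_feature_sets(g, label="Y", shift_node="S", features=candidates):
        print(sorted(z))  # every stable set here includes A, which blocks S -> A -> Y
```

Among the stable subsets, one would then pick the set that maximizes estimated accuracy while satisfying the chosen fairness metric on the (estimated) test distribution; that selection step is where the paper's worst-case optimality argument applies.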