It is critically important to be aware of the historical discrimination embedded in the data and to adopt a fairness measure to reduce bias throughout the predictive modeling pipeline. Various notions of fairness have been defined, yet choosing an appropriate measure is cumbersome. Trade-offs and impossibility theorems make this selection even more complicated and controversial. In practice, users (often regular data scientists) must understand each of the measures and, if possible, manually explore the combinatorial space of different measures before they can decide which combination is preferable given the context, the use case, and applicable regulations. To alleviate the burden of selecting fairness notions for consideration, we propose a framework that automatically discovers the correlations and trade-offs between different pairs of measures for a given context. Our framework dramatically reduces the exploration space by identifying a small subset of measures that represents the rest and by highlighting the trade-offs between them. This allows users to view unfairness from multiple perspectives that might otherwise be ignored due to the sheer size of the exploration space. We demonstrate the validity of the proposal through comprehensive experiments on real-world benchmark data sets.
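As a rough illustration of the idea (a minimal sketch, not the paper's actual algorithm), the snippet below uses hypothetical fairness-measure names and synthetic scores: it computes pairwise correlations of the measures across model configurations, clusters strongly correlated measures to keep one representative per cluster, and flags strongly anti-correlated pairs as potential trade-offs.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical input: rows are trained model configurations, columns are
# fairness measures evaluated on the same data set. Names and values are
# illustrative only (synthetic scores), not taken from the paper.
rng = np.random.default_rng(0)
scores = pd.DataFrame(
    rng.random((50, 6)),
    columns=["stat_parity", "equal_opportunity", "equalized_odds",
             "predictive_parity", "calibration", "treatment_equality"],
)

# Pairwise correlation between measures across configurations: strongly
# correlated measures move together; anti-correlated pairs signal trade-offs.
corr = scores.corr(method="spearman")

# Cluster measures whose behavior is similar (distance = 1 - |correlation|)
# and keep one representative per cluster, shrinking the exploration space.
dist = 1.0 - corr.abs()
condensed = dist.values[np.triu_indices(len(corr), k=1)]  # condensed distance matrix
labels = fcluster(linkage(condensed, method="average"), t=0.3, criterion="distance")
representatives = (
    pd.Series(corr.columns, index=labels).groupby(level=0).first().tolist()
)
print("representative measures:", representatives)

# Trade-offs worth surfacing to the user: strongly negatively correlated pairs.
tradeoffs = [(a, b) for a in corr.columns for b in corr.columns
             if a < b and corr.loc[a, b] < -0.5]
print("potential trade-offs:", tradeoffs)
```

With real evaluation results in place of the synthetic scores, the representative subset and the flagged trade-off pairs are what a user would inspect instead of the full combinatorial space of measures.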