The rapid growth of data in recent years has led to the development of complex learning algorithms that are often used to make decisions in the real world. While the positive impact of these algorithms has been tremendous, there is a need to mitigate any bias arising either from the training samples or from implicit assumptions made about the data. This need becomes critical when algorithms are used in automated decision-making systems that can significantly impact people's lives. Many approaches have been proposed to make learning algorithms fair by detecting and mitigating bias at different stages of optimization. However, due to the lack of a universal definition of fairness, these algorithms each optimize for a particular interpretation of fairness, which limits their applicability in the real world. Moreover, an underlying assumption common to all these algorithms is the apparent equivalence of achieving fairness and removing bias. In other words, there is no user-defined criterion that can be incorporated into the optimization procedure for producing a fair algorithm. Motivated by these shortcomings of existing methods, we propose the CONFAIR procedure, which produces a fair algorithm by incorporating user constraints into the optimization procedure. Furthermore, we make the process interpretable by estimating the most predictive features from data. We demonstrate the efficacy of our approach on several real-world datasets using different fairness criteria.
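The abstract does not specify how user constraints enter the optimization, so the sketch below is purely illustrative rather than the CONFAIR procedure itself: it shows one common way a user-supplied fairness tolerance can be folded into training as a soft constraint. The model choice (logistic regression), the demographic-parity penalty, and all names (`train`, `eps`, `lam`) are hypothetical assumptions made for this example.

```python
# Minimal sketch (NOT the authors' CONFAIR procedure): logistic regression
# trained with a hinge penalty that activates only when the demographic-parity
# gap exceeds a user-chosen tolerance `eps`. Data and names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: X features, s a binary sensitive attribute, y binary labels.
n, d = 1000, 5
X = rng.normal(size=(n, d))
s = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.5 * s + rng.normal(scale=0.5, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, s, eps=0.05, lam=10.0, lr=0.1, steps=2000):
    """Gradient descent on logistic loss plus a soft demographic-parity
    constraint: penalize only when |gap| exceeds the user tolerance eps."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / len(y)  # gradient of the logistic loss
        # Demographic-parity gap: difference in mean predicted rate by group.
        gap = p[s == 1].mean() - p[s == 0].mean()
        if abs(gap) > eps:
            dp = p * (1 - p)  # sigmoid derivative, used in the gap gradient
            g1 = (X[s == 1] * dp[s == 1, None]).mean(axis=0)
            g0 = (X[s == 0] * dp[s == 0, None]).mean(axis=0)
            grad += lam * np.sign(gap) * (g1 - g0)
        w -= lr * grad
    return w

w = train(X, y, s)
p = sigmoid(X @ w)
print("demographic-parity gap:", abs(p[s == 1].mean() - p[s == 0].mean()))
```

Under this framing, `eps` is the user-defined criterion the abstract alludes to: tightening it trades predictive accuracy for a smaller disparity between groups, and a different fairness definition (e.g., equalized odds) would simply swap in a different gap term.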