Ensuring fairness in machine learning algorithms is a challenging and important task. We consider the problem of clustering a set of points while ensuring fairness constraints. While there have been several attempts to capture group fairness in the k-clustering problem, fairness at an individual level is not well-studied. We introduce a new notion of individual fairness in k-clustering based on features that are not necessarily used for clustering. We show that this problem is NP-hard and does not admit a constant factor approximation. We then design a randomized algorithm that guarantees approximation both in terms of minimizing the clustering distance objective as well as individual fairness under natural restrictions on the distance metric and fairness constraints. Finally, our experimental results validate that our algorithm produces lower clustering costs compared to existing algorithms while being competitive in individual fairness.
翻译:确保机器学习算法的公平性是一项具有挑战性和重要的任务。 我们认为,在确保公平性限制的同时,将一组点分组的问题是一个具有挑战性的重要任务。 虽然在k类集问题中曾几次试图抓住群体公平性,但对个人层面的公平性没有很好地研究。 我们根据不一定用于集群的特征,在k类集中引入了个人公平的新概念。 我们表明,这个问题是NP硬的,并不接受一个恒定的系数近似值。 然后我们设计一种随机化算法,保证在限制距离度和公平性限制的自然限制下,尽可能减少集群的距离目标和个人公平性。 最后,我们的实验结果证实,我们的算法比现有的算法产生较低的集群成本,同时在个人公平性方面具有竞争力。