We study data clustering problems with $\ell_p$-norm objectives (e.g. $k$-Median and $k$-Means) in the context of individual fairness. The dataset consists of $n$ points, and we want to find $k$ centers such that (a) the objective is minimized, while (b) the individual fairness constraint is respected: every point $v$ has a center within distance at most $r(v)$, where $r(v)$ is $v$'s distance to its $(n/k)$th nearest point. Jung, Kannan, and Lutz [FORC 2020] introduced this concept and designed a clustering algorithm with provable (approximate) fairness and objective guarantees for the $\ell_\infty$, or $k$-Center, objective. Mahabadi and Vakilian [ICML 2020] (MV20) revisited this problem and gave a local-search algorithm for all $\ell_p$-norms. Empirically, their algorithms outperform Jung et al.'s by a large margin in terms of cost (for $k$-Median and $k$-Means), but they incur a reasonable loss in fairness. In this paper, our main contribution is to use Linear Programming (LP) techniques to obtain better algorithms for this problem, both in theory and in practice. We prove that by modifying known LP rounding techniques, one gets a worst-case guarantee on the objective which is much better than that of MV20, and empirically, this objective is extremely close to the optimal. Furthermore, our theoretical fairness guarantees are comparable with those of MV20, and empirically, we obtain noticeably fairer solutions. Although solving the LP {\em exactly} might be prohibitively expensive, we demonstrate that in practice, a simple sparsification technique drastically improves the run-time of our algorithm.
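To make the fairness constraint concrete, the following is a minimal Python sketch of the quantities defined above: the fairness radius $r(v)$, the worst-case fairness ratio of a candidate set of centers, and the $\ell_p$ clustering cost. It is not the paper's algorithm (no LP, rounding, or sparsification is shown). The function names are hypothetical, distances are assumed Euclidean, $n/k$ is rounded up, and each point is assumed to count as its own nearest point; the abstract does not fix these conventions.

```python
import math


def fairness_radii(points, k):
    """Return r(v) for every point v: the distance from v to its
    ceil(n/k)-th nearest point.

    Assumptions (not fixed by the source): Euclidean distance, and v
    counts as its own nearest point (at distance 0).
    """
    n = len(points)
    m = math.ceil(n / k)
    radii = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        radii.append(dists[m - 1])
    return radii


def max_fairness_ratio(points, centers, radii):
    """Largest value of d(v, nearest center) / r(v) over all points.

    A value of at most 1 means the fairness constraint holds exactly;
    a value alpha > 1 means it holds only up to a factor alpha.
    """
    worst = 0.0
    for p, r in zip(points, radii):
        d = min(math.dist(p, c) for c in centers)
        if r == 0:
            # r(v) = 0 can only be met by a center located at v itself.
            ratio = 0.0 if d == 0 else math.inf
        else:
            ratio = d / r
        worst = max(worst, ratio)
    return worst


def clustering_cost(points, centers, p):
    """Sum over points of d(v, nearest center)^p
    (p=1 gives the k-Median cost, p=2 the k-Means cost)."""
    return sum(min(math.dist(v, c) for c in centers) ** p for v in points)


if __name__ == "__main__":
    # Two well-separated groups of three points on a line, k = 2.
    pts = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
    ctrs = [(1.0,), (11.0,)]
    r = fairness_radii(pts, k=2)
    print(r)
    print(max_fairness_ratio(pts, ctrs, r))
    print(clustering_cost(pts, ctrs, p=1))
```

In this toy instance the middle point of each group is a natural center, and the ratio reported by `max_fairness_ratio` indicates how much slack remains in the fairness constraint. The brute-force radius computation takes $O(n^2 \log n)$ time and is meant only for illustration.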