We conduct an exploratory study on incorporating John Rawls' ideas on fairness into existing unsupervised machine learning algorithms. Our focus is on the task of clustering, specifically the k-means clustering algorithm. To the best of our knowledge, this is the first work that uses Rawlsian ideas in clustering. To this end, we attempt to develop a postprocessing technique, i.e., one that operates on the cluster assignment generated by the standard k-means clustering algorithm. Our technique perturbs this assignment over a number of iterations to make it fairer according to Rawls' difference principle while minimally affecting the overall utility. As a first step, we consider two simple perturbation operators, $\mathbf{R_1}$ and $\mathbf{R_2}$, that reassign examples in a given cluster assignment to new clusters: $\mathbf{R_1}$ reassigns a single example to a new cluster, and $\mathbf{R_2}$ reassigns a pair of examples to new clusters. Our experiments on a sample of the Adult dataset demonstrate that both operators make meaningful perturbations in the cluster assignment towards incorporating Rawls' difference principle, with $\mathbf{R_2}$ being more efficient than $\mathbf{R_1}$ in terms of the number of iterations. However, we observe that there is still a need to design operators that make significantly better perturbations. Nevertheless, both operators provide good baselines for designing and comparing any future operator, and we hope our findings will aid future work in this direction.
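To make the two operators concrete, the following is a minimal Python sketch of how single-example and pair reassignment could be applied as a postprocessing loop. It is not the implementation evaluated in this work. Everything beyond the operator definitions is an assumption made for illustration: the function names (`centroids`, `worst_off_utility`, `r1`, `r2`, `perturb`), the per-example utility (negative squared distance to the assigned cluster's centroid), the reading of the difference principle as raising the mean utility of the worst-off group, the uniform random choice of examples and target clusters, and the greedy acceptance rule. The sketch also omits any explicit constraint on overall utility, and it requires `k >= 2`.

```python
import numpy as np

def centroids(X, labels, k):
    # Mean of each cluster; an empty cluster falls back to the global mean.
    return np.array([
        X[labels == c].mean(axis=0) if np.any(labels == c) else X.mean(axis=0)
        for c in range(k)
    ])

def worst_off_utility(X, labels, groups, k):
    # ASSUMPTION: utility of an example is the negative squared distance
    # to the centroid of its assigned cluster.
    C = centroids(X, labels, k)
    u = -np.sum((X - C[labels]) ** 2, axis=1)
    # ASSUMPTION: the "worst-off" group is the one with the lowest mean utility.
    return min(u[groups == g].mean() for g in np.unique(groups))

def r1(labels, k, rng):
    # R1-style move: reassign one random example to a different cluster.
    new = labels.copy()
    i = rng.integers(len(labels))
    new[i] = (new[i] + rng.integers(1, k)) % k  # offset in [1, k-1] forces a change
    return new

def r2(labels, k, rng):
    # R2-style move: reassign a random pair of distinct examples.
    new = labels.copy()
    for i in rng.choice(len(labels), size=2, replace=False):
        new[i] = (new[i] + rng.integers(1, k)) % k
    return new

def perturb(X, labels, groups, k, op, iters=200, seed=0):
    # Greedy loop (ASSUMPTION): keep a candidate only if it improves
    # the worst-off group's mean utility.
    rng = np.random.default_rng(seed)
    best = worst_off_utility(X, labels, groups, k)
    for _ in range(iters):
        cand = op(labels, k, rng)
        score = worst_off_utility(X, cand, groups, k)
        if score > best:
            labels, best = cand, score
    return labels, best
```

Here `labels` would be the integer assignment returned by a standard k-means run and `groups` a hypothetical array of group identifiers, one per example; passing `r1` or `r2` as `op` selects the operator.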