We extend the fair machine learning literature by considering the problem of proportional centroid clustering in a metric context. For clustering $n$ points with $k$ centers, we define fairness as proportionality to mean that any $n/k$ points are entitled to form their own cluster if there is another center that is closer in distance for all $n/k$ points. We seek clustering solutions to which there are no such justified complaints from any subsets of agents, without assuming any a priori notion of protected subsets. We present and analyze algorithms to efficiently compute, optimize, and audit proportional solutions. We conclude with an empirical examination of the tradeoff between proportional solutions and the $k$-means objective.
翻译:我们扩展了公平机器学习文献的范围,在衡量范围内考虑比例化的机器人集群问题。对于将美元点数与美元中心组合在一起的问题,我们将公平定义为相称性,意味着任何美元/公里点数都有权形成自己的集群,如果还有另一个中心距离接近所有美元/公里点数。我们寻求从任何代理子集中没有任何合理投诉的集群解决方案,而不假定任何先入为主的受保护子集概念。我们提出和分析算法,以便有效地计算、优化和审计比例化解决方案。我们最后对比例解决方案和美元手段目标之间的权衡进行了经验性审查。