We study coresets for clustering with capacity and fairness constraints. Our main result is a near-linear time algorithm that constructs $\tilde{O}(k^2\varepsilon^{-2z-2})$-sized $\varepsilon$-coresets for capacitated $(k,z)$-clustering, improving on the recent $\tilde{O}(k^3\varepsilon^{-3z-2})$ bound of [BCAJ+22, HJLW23]. As a corollary, we also save a factor of $k \varepsilon^{-z}$ in the coreset size for fair $(k,z)$-clustering relative to these works. We fundamentally improve the hierarchical uniform sampling framework of [BCAJ+22] by adaptively selecting the sample size for each ring instance, proportional to its clustering cost with respect to an optimal solution. Our analysis relies on a key geometric observation that reduces the total number of ``effective centers'' from the $\tilde{O}(k^2\varepsilon^{-z})$ of [BCAJ+22] to merely $O(k\log \varepsilon^{-1})$, which is possible because we can ``ignore'' all center points that are too far from or too close to the ring center.
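To illustrate the idea of choosing a per-ring sample size proportional to the ring's clustering cost, the following is a minimal Python sketch. It is a toy under stated assumptions and not the algorithm of this work: the function name `ring_sample`, the `budget` parameter, the ring definition (doubling distances around each cluster's average distance), and the use of a given center set in place of an optimal solution are all hypothetical choices made for illustration. It also ignores capacity and fairness constraints entirely.

```python
import math
import random
from collections import defaultdict

def ring_sample(points, centers, budget, z=2, seed=0):
    """Toy sketch (hypothetical): uniform sampling inside each ring, with the
    ring's sample size proportional to its share of the total (k, z) cost."""
    rng = random.Random(seed)
    # Assign every point to its nearest center and record the distance.
    clusters = defaultdict(list)
    for p in points:
        d, i = min((math.dist(p, c), i) for i, c in enumerate(centers))
        clusters[i].append((p, d))
    # Assumed ring definition: ring 0 holds points within the cluster's
    # average distance r; ring j >= 1 holds distances in (2^(j-1) r, 2^j r].
    rings = defaultdict(list)
    for i, members in clusters.items():
        r = sum(d for _, d in members) / len(members)
        for p, d in members:
            j = 0 if d <= r else math.ceil(math.log2(d / r))
            rings[(i, j)].append((p, d))
    total_cost = sum(d ** z for ring in rings.values() for _, d in ring)
    coreset = []
    for ring in rings.values():
        cost = sum(d ** z for _, d in ring)
        share = cost / total_cost if total_cost > 0 else 1 / len(rings)
        # Adaptive sample size: at least 1, at most the ring size.
        m = min(len(ring), max(1, round(budget * share)))
        weight = len(ring) / m  # each sample stands in for len(ring)/m points
        for p, _ in rng.sample(ring, m):
            coreset.append((p, weight))
    return coreset
```

Because each ring is sampled uniformly and reweighted by `len(ring) / m`, the total weight of the output equals the number of input points, and the total weight inside each ring is preserved exactly. The actual sample sizes and ring construction needed for the stated coreset guarantee are not given in the text above, so the numbers produced by this sketch carry no approximation guarantee.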