Motivated by practical generalizations of the classic $k$-median and $k$-means objectives, such as clustering with size constraints, fair clustering, and Wasserstein barycenter, we introduce a meta-theorem for designing coresets for constrained-clustering problems. The meta-theorem reduces the task of coreset construction to one on a bounded number of ring instances with a much-relaxed additive error. This reduction enables us to construct coresets using uniform sampling, in contrast to the widely used importance sampling, and consequently we can easily handle constrained objectives. Notably and perhaps surprisingly, this simpler sampling scheme can yield coresets whose size is independent of $n$, the number of input points. Our technique yields smaller coresets, and sometimes the first coresets, for a large number of constrained clustering problems, including capacitated clustering, fair clustering, Euclidean Wasserstein barycenter, clustering in minor-excluded graphs, and polygon clustering under Fr\'{e}chet and Hausdorff distances. Finally, our technique also yields smaller coresets for $1$-median in low-dimensional Euclidean spaces, specifically of size $\tilde{O}(\varepsilon^{-1.5})$ in $\mathbb{R}^2$ and $\tilde{O}(\varepsilon^{-1.6})$ in $\mathbb{R}^3$.
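To give a rough feel for uniform sampling on ring instances, the following toy sketch groups points into distance rings around a fixed center and samples uniformly within each ring, reweighting each sample so that the ring's total weight is preserved. It is an illustration only and does not reproduce the meta-theorem: the geometric ring definition, the function names `ring_uniform_coreset` and `cost`, and the parameter `samples_per_ring` are all assumptions made here, and no error guarantee is claimed for this code.

```python
import math
import random


def ring_uniform_coreset(points, center, samples_per_ring, seed=0):
    """Toy sketch (hypothetical helper, not the paper's construction).

    Splits `points` into rings by distance to `center`, then draws a
    uniform sample without replacement from each ring. Every sampled
    point gets weight |ring| / |sample| so ring sizes are preserved.
    Returns a list of (point, weight) pairs.
    """
    rng = random.Random(seed)
    rings = {}
    for p in points:
        d = math.dist(p, center)
        # Assumed ring rule: ring j holds distances in [2^j, 2^(j+1));
        # points that coincide with the center go into their own bucket.
        j = None if d == 0 else math.floor(math.log2(d))
        rings.setdefault(j, []).append(p)

    coreset = []
    for ring in rings.values():
        m = min(samples_per_ring, len(ring))
        sample = rng.sample(ring, m)  # uniform, without replacement
        weight = len(ring) / m
        coreset.extend((p, weight) for p in sample)
    return coreset


def cost(weighted_points, c):
    """Weighted 1-median cost: sum of weight * Euclidean distance to c."""
    return sum(w * math.dist(p, c) for p, w in weighted_points)
```

Within a ring, all points lie at comparable distance from the center, which is the intuition for why a uniform sample can stand in for the ring. Comparing `cost(ring_uniform_coreset(points, center, m), c)` with the cost of the full point set for a few candidate centers `c` shows how the estimate behaves as `m` grows.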