We study $k$-clustering problems with lower bounds, including $k$-median and $k$-means clustering with lower bounds. In addition to the point set $P$ and the number of centers $k$, a $k$-clustering problem with (uniform) lower bounds takes a number $B$ as input, and the solution space is restricted to clusterings in which every cluster contains at least $B$ points. We demonstrate how to approximate $k$-median with lower bounds via a reduction to facility location with lower bounds, for which $O(1)$-approximation algorithms are known. We then propose a new constrained clustering problem with lower bounds in which points may be assigned multiple times (to different centers), so that for every point the clustering specifies a set of centers to which it is assigned. We call this clustering with weak lower bounds. We give an $8$-approximation for $k$-median clustering with weak lower bounds and an $O(1)$-approximation for $k$-means with weak lower bounds. We conclude by showing that, at the cost of a constant increase in the approximation factor, the number of assignments of every point can be restricted to $2$ (or, if fractional assignments are allowed, to $1+\epsilon$). This also leads to the first bicriteria approximation algorithm for $k$-means with (standard) lower bounds, where bicriteria is interpreted in the sense that the lower bounds are violated by a constant factor. All algorithms in this paper run in time polynomial in $n$ and $k$ (and $d$ for the Euclidean variants considered).
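The difference between standard and weak lower bounds can be illustrated with a minimal sketch. It is not one of the approximation algorithms from the paper, only a checker for a given solution. The function name `weak_lower_bound_cost`, the data layout, and in particular the cost convention (a point pays its distance once for every center it is assigned to) are assumptions made for illustration; the abstract does not state how multiply assigned points are charged.

```python
from math import dist


def weak_lower_bound_cost(points, centers, assignment, B, power=1):
    """Check a clustering with weak lower bounds and compute its cost.

    assignment[i] is the set of indices of centers that point i is
    assigned to (hypothetical layout). power=1 gives a k-median style
    cost, power=2 a k-means style cost.

    Assumption: a point pays its distance once per assigned center.
    """
    sizes = [0] * len(centers)
    cost = 0.0
    for i, assigned in enumerate(assignment):
        if not assigned:
            raise ValueError(f"point {i} is not assigned to any center")
        for c in assigned:
            sizes[c] += 1
            cost += dist(points[i], centers[c]) ** power
    # Every center must receive at least B assignments.
    feasible = all(size >= B for size in sizes)
    return feasible, cost


# Five points on a line, two centers, lower bound B = 3.
points = [(0,), (1,), (2,), (10,), (11,)]
centers = [(1,), (10,)]
B = 3

# Standard clustering: each point has exactly one center.
# The second cluster has only 2 points, so the lower bound is violated.
standard = [{0}, {0}, {0}, {1}, {1}]

# Weak lower bounds: point (2,) is assigned to both centers,
# so both clusters reach size 3.
weak = [{0}, {0}, {0, 1}, {1}, {1}]

print(weak_lower_bound_cost(points, centers, standard, B))
print(weak_lower_bound_cost(points, centers, weak, B))
```

In this toy instance no assignment of each point to exactly one center can give both clusters three points, since there are only five points. Allowing one point to be assigned twice makes the solution feasible, at the price of the extra assignment cost.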