We study $k$-clustering problems with lower bounds, including $k$-median and $k$-means clustering with lower bounds. In addition to the point set $P$ and the number of centers $k$, a $k$-clustering problem with (uniform) lower bounds receives a number $B$ as input. The solution space is restricted to clusterings in which every cluster contains at least $B$ points. We demonstrate how to approximate $k$-median with lower bounds via a reduction to facility location with lower bounds, for which $O(1)$-approximation algorithms are known. Then we propose a new constrained clustering problem with lower bounds in which points may be assigned multiple times (to different centers). That is, for every point, the clustering specifies a set of centers to which it is assigned. We call this clustering with weak lower bounds. We give a $(6.5+\epsilon)$-approximation for $k$-median clustering with weak lower bounds and an $O(1)$-approximation for $k$-means with weak lower bounds. We conclude by showing that, at the cost of a constant increase in the approximation factor, we can restrict the number of assignments of every point to $2$ (or, if we allow fractional assignments, to $1+\epsilon$). This also leads to the first bicriteria approximation algorithm for $k$-means with (standard) lower bounds, where bicriteria is interpreted in the sense that the lower bounds are violated by a constant factor. All algorithms in this paper run in time polynomial in $n$ and $k$ (and in $d$ for the Euclidean variants considered).
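To make the notion of weak lower bounds concrete, the following is a minimal Python sketch that checks feasibility of a given solution and evaluates its cost. It is an illustration only, not one of the approximation algorithms of the paper. The function name `weak_lower_bound_cost` and its parameters are hypothetical, and the sketch assumes Euclidean distances and that a point pays its (squared) distance to every center it is assigned to.

```python
from math import dist


def weak_lower_bound_cost(points, centers, assignment, B, squared=False):
    """Cost of a clustering with weak lower bounds, or None if infeasible.

    assignment[i] is the set of indices of centers that point i is assigned to.
    Assumption (not stated in the abstract): each (point, center) pair
    contributes its distance (k-median) or squared distance (k-means).
    """
    load = [0] * len(centers)  # number of points assigned to each center
    cost = 0.0
    for p, assigned in zip(points, assignment):
        if not assigned:
            return None  # every point must be assigned at least once
        for j in assigned:
            d = dist(p, centers[j])
            cost += d * d if squared else d
            load[j] += 1
    # Lower bound: every center must receive at least B assignments.
    if any(n < B for n in load):
        return None
    return cost


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
centers = [(0.0, 0.0), (2.0, 0.0)]

# With single assignments, the second center only receives one point.
standard = weak_lower_bound_cost(points, centers, [{0}, {0}, {1}], B=2)

# Assigning the middle point to both centers satisfies B = 2 at each center.
weak = weak_lower_bound_cost(points, centers, [{0}, {0, 1}, {1}], B=2)
```

In this toy instance three points cannot form two disjoint clusters of size $2$, so with these two centers no single-assignment solution meets the bound, whereas assigning the middle point twice does.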