We study the problem of $k$-means clustering in the space of straight-line segments in $\mathbb{R}^{2}$ under the Hausdorff distance. For this problem, we give a $(1+\epsilon)$-approximation algorithm that, for an input of $n$ segments and any fixed $k$, succeeds with constant probability and runs in time $O(n+ \epsilon^{-O(k)} + \epsilon^{-O(k)}\cdot \log^{O(k)} (\epsilon^{-1}))$. The algorithm has two main ingredients. First, we express the $k$-means objective in our metric space as a sum of algebraic functions and use the optimization technique of Vigneron~\cite{Vigneron14} to approximate its minimum. Second, we reduce the input size by computing a small coreset using the sensitivity-based sampling framework of Feldman and Langberg~\cite{Feldman11, Feldman2020}. Our results can be extended to polylines of constant complexity with a running time of $O(n+ \epsilon^{-O(k)})$.
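
For concreteness, the objective can be sketched as follows. The notation ($S$, $s_i$, $c_j$, $d_H$) is introduced here for illustration and is an assumption, not taken from the paper; the formulas are the standard definitions of the Hausdorff distance and of the $k$-means cost. Given input segments $S = \{s_1, \dots, s_n\}$, one seeks $k$ center segments $c_1, \dots, c_k$ that minimize
\[
  \mathrm{cost}(S; c_1, \dots, c_k) \;=\; \sum_{i=1}^{n} \min_{1 \le j \le k} d_H(s_i, c_j)^2,
  \qquad
  d_H(A, B) \;=\; \max\Bigl\{ \sup_{a \in A} \inf_{b \in B} \|a - b\|,\; \sup_{b \in B} \inf_{a \in A} \|a - b\| \Bigr\}.
\]
For two segments, the distance from a point to a segment is a convex function of the point, so each supremum is attained at an endpoint. Hence $d_H$ between two segments is the largest of the four endpoint-to-segment distances. A $(1+\epsilon)$-approximation then returns centers whose cost is at most $(1+\epsilon)$ times the minimum of this objective.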