In fully dynamic clustering problems, a clustering of a given data set in a metric space must be maintained while it is modified through insertions and deletions of individual points. In this paper, we resolve the complexity of fully dynamic $k$-center clustering against both adaptive and oblivious adversaries. Against oblivious adversaries, we present the first algorithm for fully dynamic $k$-center in an arbitrary metric space that maintains an optimal $(2+\epsilon)$-approximation in $O(k \cdot \mathrm{polylog}(n,\Delta))$ amortized update time. Here, $n$ is an upper bound on the number of active points at any time, and $\Delta$ is the aspect ratio of the metric space. Previously, the best known amortized update time was $O(k^2\cdot \mathrm{polylog}(n,\Delta))$, and is due to Chan, Gourqin, and Sozio (2018). Moreover, we demonstrate that our runtime is optimal up to $\mathrm{polylog}(n,\Delta)$ factors. In fact, we prove that even offline algorithms for $k$-clustering tasks in arbitrary metric spaces, including $k$-medians, $k$-means, and $k$-center, must make at least $\Omega(n k)$ distance queries to achieve any non-trivial approximation factor. This implies a lower bound of $\Omega(k)$ which holds even for the insertions-only setting. We also show deterministic lower and upper bounds for adaptive adversaries, demonstrate that an update time sublinear in $k$ is possible against oblivious adversaries for metric spaces which admit locally sensitive hash functions (LSH) and give the first fully dynamic $O(1)$-approximation algorithms for the closely related $k$-sum-of-radii and $k$-sum-of-diameter problems.
翻译:在完全动态聚类问题中,需要在度量空间中维护给定数据集的聚类,同时通过逐个点的插入和删除进行修改。在本文中,我们解决了完全动态$k$-中心聚类问题的复杂性,包括面向适应性和匿名对手。面向匿名对手,我们提出了第一个在任意度量空间中的完全动态$k$-中心算法,可以在$O(k \cdot \mathrm{polylog}(n,\Delta))$的均摊更新时间内维护一个最优$(2+\epsilon)$-近似。其中,$n$是任何时间为活跃点的上界,$\Delta$是度量空间的纵横比。以前,已知的最佳均摊更新时间为$O(k^2\cdot \mathrm{polylog}(n,\Delta))$,由Chan、Gourqin和Sozio(2018)提出。此外,我们证明了我们的运行时间除了$\mathrm{polylog}(n,\Delta)$因子以外是最优的。事实上,我们证明了在任意度量空间中解决$k$-聚类任务的离线算法(包括$k$-中位数,$k$-均值和$k$-中心)必须进行至少$\Omega(n k)$次距离查询才能实现任何非平凡近似因子。这意味着即使是在仅插入的情况下,也存在$\Omega(k)$的下界。我们还展示了适应性对手的确定性下界和上界,证明了针对可接受局部敏感哈希函数的度量空间,面向匿名对手可以实现$k$的子线性更新时间。此外,我们还给出了与$k$-半径和$k$-直径问题密切相关的第一个完全动态$O(1)$-近似算法。