We consider the classic $k$-center problem in a parallel setting, on the low-local-space Massively Parallel Computation (MPC) model, with local space per machine of $\mathcal{O}(n^{\delta})$, where $\delta \in (0,1)$ is an arbitrary constant. As a central clustering problem, the $k$-center problem has been studied extensively. Still, until very recently, all parallel MPC algorithms have been requiring $\Omega(k)$ or even $\Omega(k n^{\delta})$ local space per machine. While this setting covers the case of small values of $k$, for a large number of clusters these algorithms require large local memory, making them poorly scalable. The case of large $k$, $k \ge \Omega(n^{\delta})$, has been considered recently for the low-local-space MPC model by Bateni et al. (2021), who gave an $\mathcal{O}(\log \log n)$-round MPC algorithm that produces $k(1+o(1))$ centers whose cost has multiplicative approximation of $\mathcal{O}(\log\log\log n)$. In this paper we extend the algorithm of Bateni et al. and design a low-local-space MPC algorithm that in $\mathcal{O}(\log\log n)$ rounds returns a clustering with $k(1+o(1))$ clusters that is an $\mathcal{O}(\log^*n)$-approximation for $k$-center.
翻译:我们考虑在低局部空间大规模并行计算(MPC)模型上,以本地空间每台机器 $\mathcal{O}(n^{\delta})$ 的标准来处理经典的 $k$-中心问题。作为一个重要的聚类问题,$k$-中心问题已经得到了广泛的研究。但是,直到最近,所有的并行 MPC 算法都需要每个机器 $\Omega(k)$ 甚至 $\Omega(k n^{\delta})$ 的本地空间。虽然这种设置涵盖了小值 $k$ 的情况,但是对于大量聚类,这些算法需要大量的本地内存,使得它们的可扩展性很差。对于较大的 $k$, $k\ge \Omega(n^{\delta})$ 的情况,Bateni 等人(2021)最近考虑了低局部空间 MPC 模型,并给出了一种 $\mathcal{O}(\log \log n)$-轮 MPC 算法,该算法返回 $k(1+o(1))$ 个中心,其代价的乘性近似为 $\mathcal{O}(\log\log\log n)$。在本文中,我们扩展了 Bateni 等人的算法,并设计了一种低局部空间 MPC 算法,在 $\mathcal{O}(\log\log n)$ 轮中返回了一个具有 $k(1+o(1))$ 个聚类的聚类,该聚类是 $k$-中心问题的 $\mathcal{O}(\log^*n)$-近似解。