轮廓系数 (On Parallel k-Center Clustering)

We consider the classic $k$-center problem in a parallel setting, on the low-local-space Massively Parallel Computation (MPC) model, with local space per machine of $\mathcal{O}(n^{\delta})$, where $\delta \in (0,1)$ is an arbitrary constant. As a central clustering problem, the $k$-center problem has been studied extensively. Still, until very recently, all parallel MPC algorithms have been requiring $\Omega(k)$ or even $\Omega(k n^{\delta})$ local space per machine. While this setting covers the case of small values of $k$, for a large number of clusters these algorithms require large local memory, making them poorly scalable. The case of large $k$, $k \ge \Omega(n^{\delta})$, has been considered recently for the low-local-space MPC model by Bateni et al. (2021), who gave an $\mathcal{O}(\log \log n)$-round MPC algorithm that produces $k(1+o(1))$ centers whose cost has multiplicative approximation of $\mathcal{O}(\log\log\log n)$. In this paper we extend the algorithm of Bateni et al. and design a low-local-space MPC algorithm that in $\mathcal{O}(\log\log n)$ rounds returns a clustering with $k(1+o(1))$ clusters that is an $\mathcal{O}(\log^*n)$-approximation for $k$-center.

翻译：我们考虑在低局部空间大规模并行计算（MPC）模型上，以本地空间每台机器 $\mathcal{O}(n^{\delta})$ 的标准来处理经典的 $k$-中心问题。作为一个重要的聚类问题，$k$-中心问题已经得到了广泛的研究。但是，直到最近，所有的并行 MPC 算法都需要每个机器 $\Omega(k)$ 甚至 $\Omega(k n^{\delta})$ 的本地空间。虽然这种设置涵盖了小值 $k$ 的情况，但是对于大量聚类，这些算法需要大量的本地内存，使得它们的可扩展性很差。对于较大的 $k$, $k\ge \Omega(n^{\delta})$ 的情况，Bateni 等人（2021）最近考虑了低局部空间 MPC 模型，并给出了一种 $\mathcal{O}(\log \log n)$-轮 MPC 算法，该算法返回 $k(1+o(1))$ 个中心，其代价的乘性近似为 $\mathcal{O}(\log\log\log n)$。在本文中，我们扩展了 Bateni 等人的算法，并设计了一种低局部空间 MPC 算法，在 $\mathcal{O}(\log\log n)$ 轮中返回了一个具有 $k(1+o(1))$ 个聚类的聚类，该聚类是 $k$-中心问题的 $\mathcal{O}(\log^*n)$-近似解。

相关内容

Omega

关注 0

在Omega中，资源发放是乐观的(optimistic)，每一个应用都发放了所有的可用的资源，冲突是在提交的时候被解决的。Omega的资源管理器，本质上是一个保存着每一个节点的状态关系数据库，并且用不同的乐观并发控制来解决冲突。这样的好处是其大大的提高了调度器的性能(完全的并行，full parallelism)和资源利用率。

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

专知会员服务

57+阅读 · 2020年3月13日