In this work, we explore the unique challenges -- and opportunities -- of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, $k$-FED, based on the widely used Lloyd's method for $k$-means clustering. In contrast to many supervised problems, we show that statistical heterogeneity in federated networks can in fact benefit our analysis. We analyze $k$-FED under a center separation assumption and compare it to the best known requirements of its centralized counterpart. Our analysis shows that in heterogeneous regimes where the number of clusters per device, $k'$, is small relative to the total number of clusters over the network, $k$ (specifically, $k' \le \sqrt{k}$), we can use heterogeneity to our advantage -- significantly weakening the cluster separation requirements for $k$-FED. From a practical viewpoint, $k$-FED also has many desirable properties: it requires only one round of communication, can run asynchronously, and can handle partial participation or node/network failures. We motivate our analysis with experiments on common FL benchmarks, and highlight the practical utility of one-shot clustering through use-cases in personalized FL and device sampling.
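
To make the one-shot structure concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. Each device runs Lloyd's method locally with $k'$ clusters and sends only its centers; the server then clusters the pooled centers into $k$ groups. The function names (`lloyd`, `one_shot_federated_clustering`), the farthest-point initialization, the choice to run Lloyd's method again at the server, and the synthetic data are all illustrative assumptions; the abstract does not specify the aggregation step.

```python
import numpy as np

def lloyd(points, k, n_iter=20, seed=0):
    """Lloyd's method with farthest-point initialization (an assumed choice)."""
    rng = np.random.default_rng(seed)
    # Start from one random point, then repeatedly add the point farthest
    # from the centers chosen so far.
    centers = [points[rng.integers(len(points))]]
    while len(centers) < k:
        diffs = points[:, None, :] - np.array(centers)[None, :, :]
        dist_to_nearest = (diffs ** 2).sum(-1).min(axis=1)
        centers.append(points[np.argmax(dist_to_nearest)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = sq_dist.argmin(axis=1)
        # Update step: move each non-empty cluster's center to its mean.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers

def one_shot_federated_clustering(device_data, k_local, k_global):
    """Single communication round: devices send k_local centers each;
    the server clusters the pooled centers into k_global groups."""
    local_centers = [lloyd(X, k_local) for X in device_data]  # on-device work
    pooled = np.vstack(local_centers)                         # the only upload
    return lloyd(pooled, k_global)                            # server-side work

# Hypothetical heterogeneous network: k = 4 clusters overall, and each
# device only sees k' = 2 of them, so k' <= sqrt(k).
rng = np.random.default_rng(42)
true_means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
cluster_pairs = [(0, 1), (2, 3), (0, 2), (1, 3)]
device_data = [
    np.vstack([true_means[c] + 0.3 * rng.standard_normal((50, 2)) for c in pair])
    for pair in cluster_pairs
]
global_centers = one_shot_federated_clustering(device_data, k_local=2, k_global=4)
```

Because devices never wait on one another and the server only needs whatever centers it has received, a sketch of this shape also illustrates why a one-shot scheme tolerates asynchronous execution and partial participation: dropping a device simply removes its rows from the pooled centers.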