Federated clustering (FedC) adapts centralized clustering to the federated setting: the goal is to cluster data according to a global similarity measure while all data remain on the local clients. Two main challenges of FedC are that the data are non-independent and identically distributed (non-i.i.d.) across sources, and that privacy must be protected. In this paper, we propose a differentially private federated clustering (DP-FedC) algorithm to address both challenges. Unlike most existing algorithms, which do not consider privacy, the proposed DP-FedC algorithm is designed to handle non-convex and non-smooth problems, uses differential privacy techniques to guarantee privacy, and exploits privacy amplification to improve the tradeoff between learning performance and privacy protection. We then present theoretical analyses of the performance and privacy of DP-FedC, showing the impact of privacy protection, data heterogeneity, and partial client participation on learning performance. Finally, experimental results demonstrate the efficacy of the proposed DP-FedC algorithm, including the validity of the analytical results, together with its superior performance over state-of-the-art approaches.
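To make the ingredients named in the abstract concrete, the following is a minimal sketch of one way a differentially private federated k-means loop could be organized. It is not the DP-FedC algorithm itself, whose details the abstract does not give. The function names (`client_update`, `dp_federated_kmeans`) and the parameters `clip_norm`, `noise_std`, and `sample_frac` are illustrative assumptions. The sketch combines three generic techniques: clipping each client's update, adding Gaussian noise to it, and sampling only a fraction of clients per round (the setting in which privacy amplification by subsampling is usually invoked). It does not calibrate the noise to any specific (ε, δ) budget.

```python
import numpy as np

def client_update(data, centroids, clip_norm, noise_std, rng):
    """Hypothetical local step: one k-means update, clipped and perturbed."""
    # Assign each local point to its nearest global centroid.
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Recompute centroids from local data; empty clusters keep the global value.
    new_centroids = centroids.copy()
    for k in range(centroids.shape[0]):
        members = data[labels == k]
        if len(members) > 0:
            new_centroids[k] = members.mean(axis=0)
    delta = new_centroids - centroids
    # Clip the whole update so its norm is at most clip_norm.
    norm = np.linalg.norm(delta)
    delta = delta * min(1.0, clip_norm / max(norm, 1e-12))
    # Add Gaussian noise (assumed mechanism; not calibrated to a privacy budget).
    return delta + rng.normal(0.0, noise_std, size=delta.shape)

def dp_federated_kmeans(client_data, k, rounds, sample_frac,
                        clip_norm, noise_std, seed=0):
    """Hypothetical server loop with partial client participation."""
    rng = np.random.default_rng(seed)
    dim = client_data[0].shape[1]
    centroids = rng.normal(size=(k, dim))
    n_clients = len(client_data)
    n_sampled = max(1, int(round(sample_frac * n_clients)))
    for _ in range(rounds):
        # Sample a subset of clients without replacement for this round.
        chosen = rng.choice(n_clients, size=n_sampled, replace=False)
        deltas = [client_update(client_data[i], centroids,
                                clip_norm, noise_std, rng) for i in chosen]
        # Average the noisy, clipped updates and apply them to the global model.
        centroids = centroids + np.mean(deltas, axis=0)
    return centroids
```

In this sketch, a larger `noise_std` or a smaller `clip_norm` gives stronger perturbation at the cost of clustering accuracy, and a smaller `sample_frac` means fewer clients contribute per round. These are the same three factors (privacy protection, data heterogeneity across `client_data`, and partial participation) whose effect on learning performance the abstract says the paper analyzes.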