Federated learning (FL) is proving to be one of the most promising paradigms for leveraging distributed resources, enabling a set of clients to collaboratively train a machine learning model while keeping the data decentralized. The explosive growth of interest in the topic has led to rapid advancements in several core aspects such as communication efficiency, handling non-IID data, privacy, and security. However, the majority of FL works deal only with supervised tasks, assuming that clients' training sets are labeled. To leverage the enormous amount of unlabeled data on distributed edge devices, in this paper we aim to extend the FL paradigm to unsupervised tasks by addressing the problem of anomaly detection in decentralized settings. In particular, we propose a novel method in which, through a preprocessing phase, clients are grouped into communities, each having similar majority (i.e., inlier) patterns. Subsequently, each community of clients trains the same anomaly detection model (i.e., an autoencoder) in a federated fashion. The resulting model is then shared and used to detect anomalies within the clients of the same community that joined the corresponding federated process. Experiments show that our method is robust and can detect communities consistent with the ideal partitioning, in which the groups of clients sharing the same inlier patterns are known. Furthermore, its performance is significantly better than that of models trained exclusively on local data and comparable to that of federated models trained on the ideal community partition.
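The core idea above (FedAvg-style aggregation of autoencoders within a community, then flagging points with high reconstruction error) can be sketched minimally as follows. This is an illustrative stand-in, not the paper's implementation: it uses a tied-weight linear autoencoder in NumPy instead of a deep model, and all function names (`train_local_autoencoder`, `reconstruction_error`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_local_autoencoder(X, W, lr=0.01, epochs=100):
    """Local training step: tied-weight linear autoencoder x -> x @ W @ W.T,
    minimizing squared reconstruction error by gradient descent."""
    W = W.copy()
    n = X.shape[0]
    for _ in range(epochs):
        E = X @ W @ W.T - X                       # reconstruction residual
        grad = (X.T @ E @ W + E.T @ X @ W) / n    # gradient of 0.5*||E||^2 / n
        W -= lr * grad
    return W

def reconstruction_error(x, W):
    """Anomaly score: distance between a point and its reconstruction."""
    return float(np.linalg.norm(x @ W @ W.T - x))

# Two clients of the same community: both hold inliers near the line y = 2x
# (same majority pattern), as produced by the community-detection phase.
clients = [
    np.column_stack([t, 2 * t]) + rng.normal(scale=0.05, size=(50, 2))
    for t in (rng.normal(size=50), rng.normal(size=50))
]

# FedAvg within the community: broadcast shared weights, train locally, average.
W = rng.normal(scale=0.1, size=(2, 1))
for _ in range(5):
    W = np.mean([train_local_autoencoder(X, W) for X in clients], axis=0)

# A point matching the community's inlier pattern reconstructs well;
# a point off the shared pattern reconstructs poorly and is flagged anomalous.
err_in = reconstruction_error(np.array([1.0, 2.0]), W)
err_out = reconstruction_error(np.array([2.0, -1.0]), W)
```

The sketch averages only the clients of one community, mirroring the paper's design choice: clients with different inlier patterns would pull the shared weights toward conflicting reconstructions, which is precisely what the community-detection preprocessing avoids.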