Federated learning (FL) has emerged as an effective solution to decentralized and privacy-preserving machine learning for mobile clients. While traditional FL has demonstrated its superiority, it ignores the non-iid (non-independently and identically distributed) setting, which widely exists in mobile scenarios. Failing to handle non-iid settings can cause problems such as performance degradation and possible attacks. Previous studies focus on the "symptoms" directly: they try to improve accuracy or detect possible attacks by adding extra steps to conventional FL models. However, these techniques overlook the root cause of the "symptoms": blindly aggregating models trained on non-iid distributions. In this paper, we try to fundamentally address the issue by decomposing the overall non-iid situation into several iid clusters and conducting aggregation within each cluster. Specifically, we propose \textbf{DistFL}, a novel framework to achieve automated and accurate \textbf{Dist}ribution-aware \textbf{F}ederated \textbf{L}earning in a cost-efficient way. DistFL achieves clustering by extracting and comparing the \textit{distribution knowledge} from the uploaded models. With this framework, we are able to generate multiple personalized models with distinctive distributions and assign them to the corresponding clients. Extensive experiments on mobile scenarios with popular model architectures have demonstrated the effectiveness of DistFL.
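The cluster-then-aggregate idea can be sketched as follows. This is a minimal illustration, not the paper's actual procedure: the greedy `cluster_clients` routine, the `threshold` value, and the use of a generic per-client "distribution profile" vector are all assumptions standing in for DistFL's distribution-knowledge extraction and comparison.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two profile vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def cluster_clients(profiles, threshold=0.9):
    """Greedy clustering: a client joins the first cluster whose
    representative profile is similar enough; otherwise it starts
    a new cluster. Returns a list of clusters (lists of indices)."""
    clusters = []
    for i, p in enumerate(profiles):
        for c in clusters:
            if cosine(profiles[c[0]], p) >= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def aggregate_per_cluster(weights, clusters):
    """Plain FedAvg within each cluster, yielding one personalized
    model per (approximately iid) cluster instead of one global model."""
    return [np.mean([weights[i] for i in c], axis=0) for c in clusters]
```

For example, three clients whose profiles fall into two distinct distributions would yield two clusters, and each client would then receive the model aggregated only from its own cluster.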