Since federated learning (FL) was introduced as a decentralized, privacy-preserving learning technique, the statistical heterogeneity of distributed data has remained the main obstacle to robust performance and stable convergence in FL applications. Model personalization methods have been studied to overcome this problem. However, existing approaches mostly assume fully labeled data, which is unrealistic in practice because labeling requires expertise. The primary issue under partial labeling is that clients with too little labeled data can see unfairly low performance gains, because they lack adequate insight into their local distribution to customize the global model. To tackle this problem, 1) we propose a novel personalized semi-supervised learning paradigm that allows partially labeled or unlabeled clients to seek labeling assistance from data-related clients (helper agents), thereby enhancing their perception of local data; 2) based on this paradigm, we design an uncertainty-based data-relation metric to ensure that the selected helpers provide trustworthy pseudo labels rather than misleading the local training; 3) to mitigate the network overload introduced by helper searching, we further develop a helper selection protocol that achieves efficient communication at an acceptable cost in performance. Experiments show that the proposed method obtains superior performance and more stable convergence than related work on partially labeled data, especially in highly heterogeneous settings.
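
The abstract does not specify the uncertainty-based data-relation metric, so the following is only a minimal sketch of the general idea, under stated assumptions: each candidate helper predicts class probabilities for a client's unlabeled samples, helpers are ranked by mean predictive entropy (lower entropy is read as a closer data relation), and the selected helpers' averaged predictions give the pseudo labels. The function names, the entropy score, and the averaging rule are all hypothetical choices made for illustration, not the method of the paper.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_uncertainty(predictions: Sequence[Sequence[float]]) -> float:
    """Average predictive entropy of one helper over the client's unlabeled samples."""
    return sum(entropy(p) for p in predictions) / len(predictions)


def select_helpers(helper_predictions: Dict[str, List[List[float]]], k: int) -> List[str]:
    """Return the k helpers with the lowest mean uncertainty (assumed proxy for data relation)."""
    scores = {name: mean_uncertainty(preds) for name, preds in helper_predictions.items()}
    return sorted(scores, key=scores.get)[:k]


def pseudo_labels(helper_predictions: Dict[str, List[List[float]]], helpers: List[str]) -> List[int]:
    """Average the selected helpers' distributions per sample and take the argmax."""
    n_samples = len(helper_predictions[helpers[0]])
    labels = []
    for i in range(n_samples):
        n_classes = len(helper_predictions[helpers[0]][i])
        avg = [
            sum(helper_predictions[h][i][c] for h in helpers) / len(helpers)
            for c in range(n_classes)
        ]
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


# Hypothetical predictions from three candidate helpers on two unlabeled samples.
example_predictions = {
    "helper_a": [[0.9, 0.1], [0.2, 0.8]],
    "helper_b": [[0.5, 0.5], [0.5, 0.5]],
    "helper_c": [[0.7, 0.3], [0.4, 0.6]],
}
chosen = select_helpers(example_predictions, k=2)
labels = pseudo_labels(example_predictions, chosen)
```

In this toy example the maximally uncertain `helper_b` is excluded. Querying only the top `k` helpers also limits how many models a client must contact, which is the kind of trade-off between communication and performance that the helper selection protocol in point 3) targets.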