Federated semi-supervised learning (FSSL) enables labeled and unlabeled clients to jointly train a global model without sharing private data. Existing FSSL methods mostly rely on pseudo-labeling and consistency regularization to exploit the knowledge in unlabeled data, and they have been highly successful at utilizing raw data. However, their training procedures suffer from the large deviation between the local models of labeled clients and those of unlabeled clients, and from the confirmation bias induced by noisy pseudo-labels, both of which seriously degrade the performance of the global model. In this paper, we propose a novel FSSL method, named Dual Class-aware Contrastive Federated Semi-Supervised Learning (DCCFSSL), which simultaneously considers, in the feature space, the local class-aware distribution of each individual client's data and the global class-aware distribution of all clients' data. By introducing a dual class-aware contrastive module, DCCFSSL builds a common training goal for different clients to reduce the large deviation, and it introduces contrastive information in the feature space to alleviate the confirmation bias. In addition, DCCFSSL presents an authentication-reweighted aggregation method to make the server's aggregation more robust. Extensive experiments demonstrate that DCCFSSL not only outperforms state-of-the-art methods on three benchmark datasets, but also surpasses FedAvg trained with the unlabeled clients relabeled on the CIFAR-10 and CIFAR-100 datasets. To the best of our knowledge, this is the first FSSL method that, with only 10% of all clients labeled, achieves better performance than standard federated supervised learning in which all clients have labeled data.
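To make the idea of a dual class-aware contrastive objective more concrete, the following is a minimal sketch and not the paper's actual loss. It assumes that each class is summarized by a prototype vector in the feature space, that one set of prototypes is computed locally on the client and another is aggregated globally, and that a feature is pulled toward its own class prototype and pushed away from the others through a softmax over cosine similarities. The names `prototype_contrastive_loss`, `dual_loss`, and the `temperature` parameter are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prototype_contrastive_loss(feature, label, prototypes, temperature=0.5):
    """Negative log-softmax of the similarity to the prototype of `label`.

    Assumption: `prototypes[k]` is the prototype vector of class k.
    """
    logits = [cosine(feature, p) / temperature for p in prototypes]
    # Numerically stable log-sum-exp over all class prototypes.
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum_exp - logits[label]


def dual_loss(feature, label, local_protos, global_protos, temperature=0.5):
    """Sum of the local and global class-aware terms (an illustrative choice)."""
    local_term = prototype_contrastive_loss(feature, label, local_protos, temperature)
    global_term = prototype_contrastive_loss(feature, label, global_protos, temperature)
    return local_term + global_term


# Toy example with two classes in a 2-D feature space.
local_protos = [[1.0, 0.0], [0.0, 1.0]]
global_protos = [[0.9, 0.1], [0.1, 0.9]]
feature = [1.0, 0.0]

loss_correct = dual_loss(feature, 0, local_protos, global_protos)
loss_wrong = dual_loss(feature, 1, local_protos, global_protos)
print(loss_correct < loss_wrong)
```

In this sketch, a feature that lies close to the prototype of its (pseudo-)label incurs a smaller loss than one assigned to the wrong class, which is the sense in which contrastive information in the feature space could counteract noisy pseudo-labels. Because every client would be pulled toward the same global prototypes, the global term also acts as a shared training target across labeled and unlabeled clients.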