Federated learning is a paradigm that enables local devices to jointly train a server model while keeping the data decentralized and private. In federated learning, since local data are collected by the clients themselves, there is no guarantee that the data are correctly annotated. Although many studies have been conducted on training networks that are robust to such noisy data in the centralized setting, these algorithms still suffer from noisy labels in federated learning. Unlike the centralized setting, clients' data can have different noise distributions due to variations in their labeling systems or in the background knowledge of users. As a result, local models form inconsistent decision boundaries and their weights severely diverge from one another, which are serious problems in federated learning. To solve these problems, we introduce a novel federated learning scheme in which the server cooperates with local models to maintain consistent decision boundaries by interchanging class-wise centroids. These centroids are the central features of the local data on each device, and they are aligned by the server in every communication round. Updating local models with the aligned centroids helps them form consistent decision boundaries even though the noise distributions of the clients' data differ from one another. To further improve local model performance, we introduce a novel approach for selecting confident samples, which are used to update the model with their given labels. Furthermore, we propose a global-guided pseudo-labeling method that updates the labels of unconfident samples by exploiting the global model. Our experimental results on the noisy CIFAR-10 dataset and the Clothing1M dataset show that our approach is noticeably effective in federated learning with noisy labels.
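To make the centroid-exchange idea concrete, the following is a minimal PyTorch-style sketch, not the paper's actual implementation: the function names (`local_class_centroids`, `align_centroids`, `centroid_loss`) are illustrative, and simple averaging is assumed both for the server-side alignment step and for the local regularization term.

```python
import torch


def local_class_centroids(features, labels, num_classes):
    """Average feature vectors per class on one client.

    features: (N, D) tensor of penultimate-layer features
    labels:   (N,) tensor of (possibly noisy) class labels
    Returns a (num_classes, D) tensor of local class-wise centroids.
    """
    dim = features.size(1)
    centroids = torch.zeros(num_classes, dim, device=features.device)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            centroids[c] = features[mask].mean(dim=0)
    return centroids


def align_centroids(client_centroids):
    """Server-side alignment (assumed here to be a simple average):
    combine the clients' class-wise centroids so every client receives
    the same per-class reference features each communication round."""
    return torch.stack(client_centroids).mean(dim=0)


def centroid_loss(features, labels, aligned_centroids):
    """Local update term: pull each sample's feature toward the aligned
    centroid of its class, encouraging consistent decision boundaries
    across clients despite differing local noise distributions."""
    return ((features - aligned_centroids[labels]) ** 2).sum(dim=1).mean()
```

In this sketch, each client would add `centroid_loss` to its usual classification objective during local training, so that features drift toward the shared per-class references rather than toward client-specific, noise-distorted ones.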
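The sample-selection and pseudo-labeling steps might look roughly like the sketch below. The small-loss heuristic in `split_confident` is a common noisy-label technique used here purely for illustration (the abstract does not specify the paper's selection criterion), and the confidence threshold `tau` is likewise an assumed hyperparameter.

```python
import torch
import torch.nn.functional as F


def split_confident(local_logits, labels, keep_ratio=0.5):
    """Illustrative small-loss selection (an assumption, not necessarily
    the paper's criterion): treat the samples with the smallest
    cross-entropy loss under the local model as confidently labeled."""
    losses = F.cross_entropy(local_logits, labels, reduction="none")
    k = max(1, int(keep_ratio * len(labels)))
    confident_idx = torch.topk(-losses, k).indices
    mask = torch.zeros(len(labels), dtype=torch.bool, device=labels.device)
    mask[confident_idx] = True
    return mask  # True = keep the given label; False = unconfident


def global_pseudo_labels(global_model, inputs, tau=0.9):
    """Global-guided pseudo-labeling: relabel unconfident samples with the
    global model's prediction whenever that prediction is sufficiently
    confident (probability >= tau); otherwise leave the sample unused."""
    with torch.no_grad():
        probs = F.softmax(global_model(inputs), dim=1)
    confidence, pseudo = probs.max(dim=1)
    return pseudo, confidence >= tau
```

Under these assumptions, confident samples train the local model with their given labels, while unconfident ones are either relabeled by the global model or dropped for that round.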