The ubiquity of edge devices has led to a growing amount of unlabeled data being produced at the edge. Deep learning models deployed on edge devices must learn from these unlabeled data to continuously improve their accuracy. Self-supervised representation learning has achieved promising performance on centralized unlabeled data. However, growing awareness of privacy protection limits centralizing the unlabeled image data distributed across edge devices. While federated learning has been widely adopted to enable distributed machine learning with privacy preservation, without a method to efficiently select streaming data, the traditional federated learning framework cannot handle such large volumes of decentralized unlabeled data under the limited storage resources of edge devices. To address these challenges, we propose a Federated on-device Contrastive learning framework with Coreset selection, which we call FedCoCo, to automatically select a coreset of the most representative samples into the replay buffer on each device. FedCoCo preserves data privacy, since no client shares raw data, while learning good visual representations. Experiments demonstrate the effectiveness of the proposed method for visual representation learning.
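To make the idea of selecting "most representative" samples into a bounded replay buffer concrete, the sketch below uses greedy farthest-point (k-center) selection over sample embeddings. This is only an illustrative stand-in: the abstract does not specify FedCoCo's actual selection criterion, and the function names and the use of raw embedding vectors here are assumptions for demonstration.

```python
import math

def select_coreset(embeddings, buffer_size):
    """Pick `buffer_size` indices whose embeddings cover the batch well.

    Illustrative greedy k-center (farthest-point) selection, one common
    notion of a 'representative' coreset; not FedCoCo's actual method.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    n = len(embeddings)
    if buffer_size >= n:
        return list(range(n))

    # Seed with the sample closest to the mean embedding (a "typical" point).
    dim = len(embeddings[0])
    mean = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    first = min(range(n), key=lambda i: dist(embeddings[i], mean))
    selected = [first]

    # nearest[i] = distance from sample i to its closest selected sample.
    nearest = [dist(e, embeddings[first]) for e in embeddings]
    while len(selected) < buffer_size:
        # Add the sample farthest from the current coreset: it fills the
        # largest remaining gap in coverage of the embedding space.
        nxt = max(range(n), key=lambda i: nearest[i])
        selected.append(nxt)
        for i in range(n):
            nearest[i] = min(nearest[i], dist(embeddings[i], embeddings[nxt]))
    return selected
```

In a federated setting, each client would run such a selection locally on its own stream and keep only the chosen samples in its replay buffer, so raw data never leaves the device.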