While most existing federated learning approaches require clients to have fully labeled training data, in realistic settings the data obtained at the client side often comes without any accompanying labels. This lack of labels may result from high labeling cost or from the difficulty of annotation when expert knowledge is required. Thus the private data at each client may be only partly labeled, or completely unlabeled with labeled data available only at the server, which leads us to a new practical federated learning problem, namely Federated Semi-Supervised Learning (FSSL). In this work, we study two essential scenarios of FSSL based on the location of the labeled data. The first scenario considers a conventional case where clients have both labeled and unlabeled data (labels-at-client), and the second considers a more challenging case where the labeled data is available only at the server (labels-at-server). We then propose a novel method to tackle both problems, which we refer to as Federated Matching (FedMatch). FedMatch improves upon naive combinations of federated learning and semi-supervised learning approaches with a new inter-client consistency loss and a decomposition of the parameters for disjoint learning on labeled and unlabeled data. Through extensive experimental validation in the two scenarios, we show that our method outperforms both local semi-supervised learning and baselines that naively combine federated learning with semi-supervised learning. The code is available at https://github.com/wyjeong/FedMatch.
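The two ingredients named in the abstract, an inter-client consistency loss and a parameter decomposition for disjoint learning, can be illustrated with a minimal sketch. The abstract does not give their exact form, so everything below is an assumption made for illustration: the consistency term is taken to be an average KL divergence between the local model's prediction and the predictions of other clients' models on the same unlabeled input, and the decomposition is taken to be additive (theta = sigma + psi), with sigma updated only on labeled data and psi only on unlabeled data. All function names are hypothetical and are not taken from the FedMatch repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def inter_client_consistency(local_probs, helper_probs_list):
    """Assumed form: mean KL(helper || local) over the other clients' models.

    The value is 0 when every helper model agrees with the local model.
    """
    if not helper_probs_list:
        return 0.0
    total = sum(kl_divergence(h, local_probs) for h in helper_probs_list)
    return total / len(helper_probs_list)


def compose(sigma, psi):
    """Assumed additive decomposition: theta = sigma + psi."""
    return [s + u for s, u in zip(sigma, psi)]


def supervised_step(sigma, psi, grad_theta, lr):
    """Labeled-data update: move sigma, leave psi untouched."""
    new_sigma = [s - lr * g for s, g in zip(sigma, grad_theta)]
    return new_sigma, psi


def unsupervised_step(sigma, psi, grad_theta, lr):
    """Unlabeled-data update: move psi, leave sigma untouched."""
    new_psi = [u - lr * g for u, g in zip(psi, grad_theta)]
    return sigma, new_psi


if __name__ == "__main__":
    local = softmax([2.0, 1.0, 0.1])
    helpers = [softmax([1.8, 1.1, 0.2]), softmax([0.1, 1.0, 2.0])]
    print("consistency loss:", inter_client_consistency(local, helpers))

    sigma, psi = [0.5, -0.2], [0.1, 0.3]
    sigma, psi = supervised_step(sigma, psi, grad_theta=[1.0, 1.0], lr=0.1)
    sigma, psi = unsupervised_step(sigma, psi, grad_theta=[1.0, -1.0], lr=0.1)
    print("theta:", compose(sigma, psi))
```

Because theta is a plain sum under this assumption, the gradient with respect to sigma or psi equals the gradient with respect to theta, so each step can reuse the same gradient while writing to only one of the two parameter sets. That separation is what keeps the labeled and unlabeled updates from overwriting each other in this sketch.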