We study the problem of learning from positive and unlabeled (PU) data in the federated setting, where each client labels only a small portion of its dataset due to limited resources and time. Unlike traditional PU learning, where the negative class consists of a single class, the negative samples that a client cannot identify in the federated setting may come from multiple classes that are unknown to that client. Existing PU learning methods can therefore hardly be applied in this situation. To address this problem, we propose a novel framework, namely Federated learning with Positive and Unlabeled data (FedPU), which minimizes the expected risk over multiple negative classes by leveraging the labeled data held by other clients. We theoretically analyze the generalization bound of the proposed FedPU. Empirical experiments show that FedPU achieves much better performance than conventional supervised and semi-supervised federated learning methods. Code is available at https://github.com/littleSunlxy/FedPU-torch
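The FedPU objective itself is not given in this excerpt. As background for the "expected risk" the abstract refers to, the following is a minimal sketch of the non-negative PU risk estimator (Kiryo et al., 2017) that single-negative-class PU methods build on; the function names, the sigmoid surrogate loss, and the known class prior `prior` are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def sigmoid_loss(scores, label):
    # Surrogate loss l(z, y) = sigmoid(-y * z); small when sign(z) matches y.
    return 1.0 / (1.0 + np.exp(label * scores))

def nnpu_risk(scores_p, scores_u, prior):
    """Non-negative PU risk estimate:
        R = pi * E_p[l(f(x), +1)] + max(0, E_u[l(f(x), -1)] - pi * E_p[l(f(x), -1)])
    where pi is the (assumed known) positive-class prior. The max(0, .) clamp
    keeps the estimated negative-class risk from going negative.
    """
    risk_pos = prior * sigmoid_loss(scores_p, +1).mean()
    risk_neg = sigmoid_loss(scores_u, -1).mean() - prior * sigmoid_loss(scores_p, -1).mean()
    return risk_pos + max(0.0, risk_neg)

# Toy usage: classifier scores f(x) for labeled-positive and unlabeled samples.
risk = nnpu_risk(np.array([2.0, 1.5]), np.array([-1.0, 0.5, 2.0]), prior=0.4)
```

In the federated setting described above, each client would evaluate such a risk on its own positive and unlabeled data, with the multi-class negative structure handled via the labeled data on other clients.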