Existing federated classification algorithms typically assume that the local annotations at every client cover the same set of classes. In this paper, we lift this assumption and focus on a more general yet practical non-IID setting in which clients can work on non-identical and even disjoint sets of classes (i.e., client-exclusive classes), while sharing the common goal of building a global classification model that identifies the union of these classes. Such heterogeneity in client class sets poses a new challenge: how can we ensure that different clients operate in the same latent space, so as to avoid drift after aggregation? We observe that the classes can be described in natural language (i.e., by their class names) and that these names are typically safe to share with all parties. We therefore formulate classification as a matching process between data representations and class representations, and split the classification model into a data encoder and a label encoder. We leverage the natural-language class names as the common ground that anchors the class representations in the label encoder. In each round, the label encoder updates the class representations and regulates the data representations through matching. We further use the updated class representations at each round to annotate data samples for locally unaware classes according to similarity, and distill this knowledge into the local models. Extensive experiments on four real-world datasets show that the proposed method can outperform various classical and state-of-the-art federated learning methods designed for learning with non-IID data.
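The matching formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed toy vectors stand in for the outputs of the data encoder and the label encoder, cosine similarity is an assumed choice of similarity measure, and the helper names (`cosine`, `match`, `annotate_unaware`) and the `threshold` parameter are hypothetical, introduced only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def match(data_repr, class_reprs):
    """Classify by matching: score a data representation against every
    class representation and return the best class plus all scores."""
    scores = {name: cosine(data_repr, vec) for name, vec in class_reprs.items()}
    return max(scores, key=scores.get), scores


def annotate_unaware(data_repr, class_reprs, local_classes, threshold=0.8):
    """Pseudo-annotate a sample for classes the client has no labels for.
    The fixed threshold is an assumption made for this sketch."""
    _, scores = match(data_repr, class_reprs)
    return sorted(
        name
        for name, score in scores.items()
        if name not in local_classes and score >= threshold
    )


# Toy class representations; in the described method these would come from
# a label encoder applied to the shared class names (assumed values here).
class_reprs = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "fox": [0.7, 0.7]}

# A sample whose (toy) data representation lies closest to "cat".
predicted, scores = match([0.9, 0.1], class_reprs)

# A client that only annotates "cat" pseudo-labels a sample for other classes.
pseudo = annotate_unaware(
    [0.6, 0.8], class_reprs, local_classes={"cat"}, threshold=0.75
)
```

Because every client scores its data against the same shared set of class representations, a client can assign labels for classes it never annotated locally, which is the role the abstract gives to the similarity-based annotation step.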