Classical federated learning approaches incur significant performance degradation in the presence of non-IID client data. One direction for addressing this issue is to form clusters of clients whose data are approximately IID. Most solutions following this direction are iterative, relatively slow, and prone to convergence issues when discovering the underlying cluster structure. We introduce federated learning with taskonomy (FLT), which generalizes this direction by learning the task-relatedness between clients for more efficient federated aggregation of heterogeneous data. In a one-off process, the server provides the clients with a pretrained (and fine-tunable) encoder, with which each client compresses its data into a latent representation and transmits the resulting data signature back to the server. The server then learns the task-relatedness among clients via manifold learning and performs a generalization of federated averaging. FLT can flexibly handle a generic client relatedness graph when there are no explicit clusters of clients, and can also efficiently decompose the graph into (disjoint) clusters for clustered federated learning. We demonstrate that FLT not only outperforms the existing state-of-the-art baselines in non-IID scenarios but also offers improved fairness across clients.