Data privacy and class imbalance are the norm rather than the exception in many machine learning tasks. Recent efforts have, on one hand, addressed the problem of learning from pervasive private data and, on the other hand, tackled learning from long-tailed data. However, both conditions often hold simultaneously in practical applications, while an effective method that alleviates both issues at once is still under development. In this paper, we focus on learning with long-tailed (LT) data distributions under the popular privacy-preserving federated learning (FL) framework. We characterize three scenarios with different local or global long-tailed data distributions in the FL framework, and highlight the corresponding challenges. The preliminary results under these scenarios reveal that substantial future work is necessary to better resolve the characterized federated long-tailed learning tasks.
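To make the notion of a long-tailed distribution in an FL setting concrete, the following is a minimal sketch of one of the scenarios described above: a globally long-tailed label distribution that every client inherits locally. All names and parameters (class counts, imbalance ratio, number of clients) are illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np

# Hypothetical illustration: build a globally long-tailed label set in
# which class frequencies decay exponentially from head to tail.
rng = np.random.default_rng(0)
num_classes = 10
imbalance = 0.1  # assumed ratio: the tail class has 10% of the head class size
class_sizes = (
    5000 * imbalance ** (np.arange(num_classes) / (num_classes - 1))
).astype(int)
labels = np.repeat(np.arange(num_classes), class_sizes)
rng.shuffle(labels)

# Partition the labels uniformly across clients: with an IID split,
# each client's local data mirrors the global long-tailed distribution.
num_clients = 5
client_labels = np.array_split(labels, num_clients)
for i, cl in enumerate(client_labels):
    counts = np.bincount(cl, minlength=num_classes)
    print(f"client {i} per-class counts: {counts.tolist()}")
```

Replacing the uniform split with a non-IID partition (e.g. Dirichlet sampling over clients) would instead produce the scenario where local distributions are long-tailed in different ways while the global distribution may or may not be.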