Federated Learning on Non-IID Data: A Survey

Federated learning is an emerging distributed machine learning framework for privacy preservation. However, models trained in federated learning usually perform worse than those trained in the standard centralized learning mode, especially when the training data on the local devices are not independent and identically distributed (Non-IID). In this survey, we provide a detailed analysis of the influence of Non-IID data on both parametric and non-parametric machine learning models in both horizontal and vertical federated learning. In addition, current research on handling the challenges of Non-IID data in federated learning is reviewed, and the advantages and disadvantages of these approaches are discussed. Finally, we suggest several future research directions before concluding the paper.
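To make the Non-IID setting concrete, here is a minimal sketch contrasting an IID split with a label-skewed split of a toy dataset across clients. Label skew is only one of several kinds of Non-IID data, and the function names (`iid_partition`, `label_skew_partition`, `label_histograms`) and the toy data are illustrative assumptions, not taken from the survey.

```python
import random
from collections import Counter


def iid_partition(labels, num_clients, seed=0):
    """Shuffle sample indices and deal them round-robin to clients."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return [indices[c::num_clients] for c in range(num_clients)]


def label_skew_partition(labels, num_clients):
    """Sort indices by label and cut them into contiguous shards.

    Each client then sees only a few classes. Any remainder that does
    not divide evenly among the clients is dropped in this sketch.
    """
    indices = sorted(range(len(labels)), key=lambda i: labels[i])
    shard = len(indices) // num_clients
    return [indices[c * shard:(c + 1) * shard] for c in range(num_clients)]


def label_histograms(labels, partition):
    """Count how many samples of each class every client holds."""
    return [Counter(labels[i] for i in part) for part in partition]


# Hypothetical toy dataset: 4 classes with 25 samples each.
labels = [k for k in range(4) for _ in range(25)]
iid = iid_partition(labels, num_clients=4)
skewed = label_skew_partition(labels, num_clients=4)

for name, part in (("IID", iid), ("label-skewed", skewed)):
    print(name, [dict(h) for h in label_histograms(labels, part)])
```

In the label-skewed split each client holds a single class, so a model fitted locally on one client never sees the other three classes. This is the kind of distribution mismatch the abstract associates with degraded performance.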

Federated Learning

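Federated learning is an emerging foundational technology for artificial intelligence. It was first proposed by Google in 2016, originally to address the problem of updating models locally for users of Android phones. Its design goal is to enable efficient machine learning across multiple participants or computing nodes while ensuring information security during large-scale data exchange, protecting device data and personal data privacy, and remaining legally compliant. The machine learning algorithms that can be used in federated learning are not limited to neural networks; they also include other important algorithms such as random forests. Federated learning is expected to become the foundation of the next generation of collaborative AI algorithms and collaborative networks.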
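As a rough illustration of how several participants can train together without sharing raw data, the sketch below shows one server-side aggregation step: each client sends only its locally updated parameters, and the server combines them with a weighted average in the style of FedAvg. The text above does not name a specific aggregation rule, so the use of weighted averaging, the function name `federated_average`, and the example numbers are all assumptions.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples on each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


# Hypothetical round: two clients holding 1 and 3 local samples.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)  # [2.5, 3.5]
```

Only parameter vectors cross the network in this sketch; the local training data stay on each device. A real system would add local training, communication, and typically further privacy protections.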