FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks

Increasing concerns and regulations about data privacy and sparsity necessitate the study of privacy-preserving, decentralized learning methods for natural language processing (NLP) tasks. Federated learning (FL) provides promising approaches for a large number of clients (e.g., personal devices or organizations) to collaboratively learn a shared global model that benefits all clients while allowing users to keep their data local. Despite interest in studying FL methods for NLP tasks, a systematic comparison and analysis is lacking in the literature. Herein, we present FedNLP, a benchmarking framework for evaluating federated learning methods on four task formulations: text classification, sequence tagging, question answering, and seq2seq. We propose a universal interface between Transformer-based language models (e.g., BERT, BART) and FL methods (e.g., FedAvg, FedOPT) under various non-IID partitioning strategies. Our extensive experiments with FedNLP provide empirical comparisons between FL methods and help us better understand the inherent challenges of this direction. The comprehensive analysis points to intriguing and exciting future research aimed at developing FL methods for NLP tasks.
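To make "non-IID partitioning" concrete, the following is a minimal sketch of one common label-skew scheme: for each label, the examples are split across clients in proportions drawn from a Dirichlet distribution. This is an illustration under stated assumptions, not FedNLP's actual interface; the function name `dirichlet_label_partition` and its parameters are hypothetical. A smaller `alpha` gives more skewed (more non-IID) clients.

```python
import random
from collections import defaultdict


def dirichlet_label_partition(labels, num_clients, alpha, seed=0):
    """Split example indices across clients with label-skewed proportions.

    Hypothetical helper for illustration only; not part of FedNLP.
    """
    rng = random.Random(seed)

    # Group example indices by their label.
    by_label = defaultdict(list)
    for idx, y in enumerate(labels):
        by_label[y].append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_label):
        idxs = by_label[y]
        rng.shuffle(idxs)

        # A symmetric Dirichlet(alpha) sample is a vector of Gamma(alpha, 1)
        # draws normalized to sum to one.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws)

        # Convert cumulative proportions into cut points over this label's
        # examples; the last client takes whatever remains.
        start, cum = 0, 0.0
        for c in range(num_clients):
            cum += draws[c] / total
            end = len(idxs) if c == num_clients - 1 else round(cum * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


if __name__ == "__main__":
    toy_labels = [i % 4 for i in range(200)]  # 4 balanced classes
    parts = dirichlet_label_partition(toy_labels, num_clients=5, alpha=0.5)
    for c, part in enumerate(parts):
        counts = [sum(1 for i in part if toy_labels[i] == y) for y in range(4)]
        print(f"client {c}: per-label counts {counts}")
```

Every example is assigned to exactly one client, so the partition can be used directly to build per-client training sets.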


Federated Learning

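Federated learning is an emerging foundational AI technique. It was first proposed by Google in 2016, originally to let Android phone users update models locally on their devices. Its design goal is to enable efficient machine learning across multiple participants or compute nodes while keeping information secure during large-scale data exchange, protecting device and personal data privacy, and remaining legally compliant. The machine learning algorithms that can be used in federated learning are not limited to neural networks; they also include other important algorithms such as random forests. Federated learning is expected to become the foundation of the next generation of collaborative AI algorithms and networks.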
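The idea of learning across participants without moving their data can be illustrated with the aggregation step of federated averaging (FedAvg): each client trains locally and sends only its model parameters, and the server combines them weighted by local dataset size. The sketch below assumes parameters are flat lists of floats; the function name `fedavg` is hypothetical and the local training step is omitted.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights: list of equal-length lists of floats (one per client).
    client_sizes:   number of local training examples for each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        for i in range(dim):
            # Clients with more data contribute proportionally more.
            global_weights[i] += (n / total) * weights[i]
    return global_weights


if __name__ == "__main__":
    # Two clients: the second holds three times as much data as the first.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Only the parameter vectors and the example counts reach the server in this sketch; the raw training data stays on each client.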
