Transformer-based pre-trained models have revolutionized NLP with their superior performance and generality. Fine-tuning pre-trained models for downstream tasks often requires private data, for which federated learning is the de-facto approach (i.e., FedNLP). However, our measurements show that FedNLP is prohibitively slow due to the large model sizes and the resultant high network/computation cost. Towards practical FedNLP, we identify adapters, small bottleneck modules inserted at various model layers, as the key building block. A key challenge is to properly configure the depth and width of adapters, to which the training speed and efficiency are highly sensitive. No silver-bullet configuration exists: the optimal choice varies across downstream NLP tasks, desired model accuracy, and client resources, and a non-optimal configuration could significantly slow down training. To automate adapter configuration, we propose AutoFedNLP, a framework that enhances existing FedNLP with two novel designs. First, AutoFedNLP progressively upgrades the adapter configuration throughout a training session. Second, AutoFedNLP continuously profiles future adapter configurations by allocating participant devices to trial groups. To minimize client-side computation, AutoFedNLP exploits the fact that a FedNLP client trains on the same samples repeatedly between consecutive changes of adapter configuration, and caches the computed activations on clients. Extensive experiments show that AutoFedNLP can reduce FedNLP's model convergence delay to no more than several hours, which is up to 155.5$\times$ faster than vanilla FedNLP and 48$\times$ faster than strong baselines.
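As a rough illustration of the adapter idea the abstract refers to, the following is a minimal NumPy sketch of one bottleneck adapter: a down-projection to a narrow width, a nonlinearity, an up-projection back to the hidden size, and a residual connection. All names and sizes here are illustrative assumptions, not the paper's actual implementation; the bottleneck `width` is the tunable "width" knob, and "depth" would correspond to how many transformer layers receive such a module.

```python
import numpy as np

def adapter_forward(h, W_down, W_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual add.

    h:      (batch, hidden) activations from a transformer sublayer
    W_down: (hidden, width) down-projection
    W_up:   (width, hidden) up-projection
    """
    z = np.maximum(h @ W_down, 0.0)  # ReLU in the bottleneck
    return h + z @ W_up              # residual connection

hidden = 768   # transformer hidden size (illustrative)
width = 32     # adapter bottleneck width -- the tunable "width"

rng = np.random.default_rng(0)
W_down = rng.standard_normal((hidden, width)) * 0.01
W_up = np.zeros((width, hidden))   # zero init: adapter starts as identity

h = rng.standard_normal((4, hidden))   # a batch of 4 token representations
out = adapter_forward(h, W_down, W_up)
```

Note that only `W_down` and `W_up` (2 * hidden * width parameters) would be trained and exchanged in federated rounds, far fewer than a full hidden-by-hidden layer, which is why adapter configuration dominates the network/computation cost discussed above.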