With increasing concerns over data privacy, recent studies have made significant progress in applying federated learning (FL) to privacy-sensitive natural language processing (NLP) tasks. Much of this literature suggests that fully fine-tuning pre-trained language models (PLMs) under the FL paradigm can mitigate the data heterogeneity problem and close the performance gap with centralized training. However, large PLMs impose prohibitive communication overhead and local model adaptation costs on the FL system. To this end, we introduce various parameter-efficient tuning (PETuning) methods into federated learning. Specifically, we provide a holistic empirical study of representative PLM tuning methods in FL. The experiments cover analyses of data heterogeneity levels, data scales, and different FL scenarios. We find that the overall communication overhead can be significantly reduced by locally tuning and globally aggregating lightweight model parameters while maintaining acceptable performance across various FL settings. To facilitate research on PETuning in FL, we also develop a federated tuning framework, FedPETuning, which allows practitioners to conveniently apply different PETuning methods under the FL training paradigm. The source code is available at \url{https://github.com/iezhuozhuo/FedETuning/tree/deltaTuning}.
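The central mechanism, locally tuning and globally aggregating only the lightweight parameters, can be illustrated with a minimal sketch. The snippet below is a hypothetical illustration and not the FedPETuning implementation: the parameter names (e.g., `lora_A`), the `trainable_subset` helper, and the `fedavg` routine are assumptions made for exposition. Each client communicates only the parameters of its PETuning module, and the server applies weighted federated averaging to that subset, which is how the communication savings arise.

```python
# Hypothetical sketch (not the FedPETuning API): clients tune only lightweight
# modules (here, parameters whose names contain "lora"), and the server
# aggregates just those parameters with weighted federated averaging.
import torch


def trainable_subset(state_dict, keyword="lora"):
    """Keep only the lightweight parameters that are communicated to the server."""
    return {k: v for k, v in state_dict.items() if keyword in k}


def fedavg(client_updates, client_sizes):
    """Weighted average of the clients' lightweight parameters."""
    total = sum(client_sizes)
    keys = client_updates[0].keys()
    return {
        k: sum(w * u[k] for w, u in zip(client_sizes, client_updates)) / total
        for k in keys
    }


# Toy example with two clients: only the LoRA matrices are uploaded,
# while the frozen backbone weights stay on each client.
client_a = {
    "encoder.layer0.lora_A": torch.ones(2, 2),
    "encoder.layer0.weight": torch.zeros(4, 4),  # frozen, never communicated
}
client_b = {
    "encoder.layer0.lora_A": 3 * torch.ones(2, 2),
    "encoder.layer0.weight": torch.zeros(4, 4),
}

updates = [trainable_subset(client_a), trainable_subset(client_b)]
global_lora = fedavg(updates, client_sizes=[100, 300])
print(global_lora["encoder.layer0.lora_A"])  # weighted average: all entries are 2.5
```

Because only the PETuning parameters leave the client, the per-round payload scales with the size of the lightweight module rather than the full PLM, which is what enables the communication reduction reported above.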