Post-training of foundation language models has emerged as a promising research direction in federated learning (FL), with the goal of enabling privacy-preserving model improvement and adaptation to users' downstream tasks. Recent advances in this area adopt centralized post-training approaches built upon black-box foundation language models, where model weights and architectural details are inaccessible. Although black-box models have proven successful in centralized post-training, blindly replicating their use in FL raises several concerns. Our position is that the use of black-box models in FL contradicts core principles of federation, such as data privacy and autonomy. In this position paper, we critically analyze the use of black-box models in federated post-training and provide a detailed account of the various aspects of openness and their implications for FL.