Natural language processing (NLP) powers a rich set of mobile applications. To support various language understanding tasks, a foundation NLP model is often fine-tuned in a federated, privacy-preserving setting (FL). This process currently relies on at least hundreds of thousands of labeled training samples from mobile clients; yet mobile users often lack the willingness or expertise to label their data. Such an inadequacy of data labels is known as a few-shot scenario, and it is the key blocker for mobile NLP applications. For the first time, this work investigates federated NLP in the few-shot scenario (FedFSL). By retrofitting algorithmic advances in pseudo labeling and prompt learning, we first establish a training pipeline that delivers competitive accuracy when only 0.05% (fewer than 100) of the training samples are labeled and the rest are unlabeled. To instantiate the workflow, we further present FFNLP, a system that addresses the high execution cost with novel designs: (1) curriculum pacing, which injects pseudo labels into the training workflow at a rate commensurate with the learning progress; (2) representational diversity, a mechanism for selecting the most learnable data, for which alone pseudo labels are generated; (3) co-planning of the model's training depth and layer capacity. Together, these designs reduce training delay, client energy consumption, and network traffic by up to 46.0$\times$, 41.2$\times$, and 3000.0$\times$, respectively. Through algorithm/system co-design, FFNLP demonstrates that FL can be applied to challenging settings where most training samples are unlabeled.
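To make the curriculum-pacing idea concrete, the following minimal Python sketch admits pseudo labels at a per-round budget that grows with a proxy for learning progress, taking the most confident predictions first. The `pace` function, its linear schedule, and all variable names are illustrative assumptions, not the FFNLP implementation.

```python
# A minimal sketch (not the authors' implementation) of curriculum-paced
# pseudo labeling: the number of pseudo labels admitted per round grows
# with a proxy for learning progress. The linear pacing rule below is an
# illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)

def pace(progress: float, pool_size: int, floor: int = 8) -> int:
    """Admit more pseudo labels as training progress (in [0, 1]) improves."""
    return min(pool_size, floor + int(progress * pool_size))

def pseudo_label_round(probs: np.ndarray, progress: float):
    """probs: (N, C) class probabilities from the current global model on
    unlabeled client data. Returns indices and hard labels for the most
    confident examples, capped by the curriculum pace."""
    conf = probs.max(axis=1)        # model confidence per example
    order = np.argsort(-conf)       # most confident first
    budget = pace(progress, len(conf))
    chosen = order[:budget]
    return chosen, probs[chosen].argmax(axis=1)

# Toy usage: 1000 unlabeled examples, 4 classes, early in training.
probs = rng.dirichlet(alpha=np.ones(4), size=1000)
idx, labels = pseudo_label_round(probs, progress=0.1)
print(f"admitted {len(idx)} pseudo labels this round")
```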