Black-box Prompt Learning for Pre-trained Language Models

The increasing scale of general-purpose Pre-trained Language Models (PLMs) necessitates the study of more efficient adaptation across different downstream tasks. In this paper, we propose Black-box Discrete Prompt Learning (BDPL), designed to match the practical interaction between cloud infrastructure and edge devices. Specifically, instead of fine-tuning the model in the cloud, we adapt PLMs by prompt learning, which efficiently optimizes only a few parameters of the discrete prompts. Moreover, we consider the scenario in which we have no access to the parameters and gradients of the pre-trained models, only to their outputs for given inputs. This black-box setting protects the cloud infrastructure from potential attacks and misuse that could cause a single-point failure, which makes it preferable to the white-box counterpart for current infrastructures. Under this black-box constraint, we apply a variance-reduced policy gradient algorithm to estimate the gradients of the parameters in the categorical distribution of each discrete prompt. With our method, user devices can efficiently tune their tasks by querying the PLMs within a bounded number of API calls. Our experiments on RoBERTa and GPT-3 demonstrate that the proposed algorithm achieves significant improvement on eight benchmarks in a cloud-device collaboration manner. Finally, we conduct in-depth case studies to analyze our method in terms of data size, prompt length, training budget, optimization objective, prompt transferability, and explanations of the learned prompts. Our code will be available at https://github.com/shizhediao/Black-Box-Prompt-Learning.
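
To make the black-box update concrete, the following is a minimal sketch of a baseline-corrected (variance-reduced) policy gradient step over per-position categorical distributions. It is not the released BDPL code. The softmax-logit parameterization, the function names, the hyperparameters, and the toy loss that stands in for a real API call are all assumptions made for illustration. The only information the optimizer receives from the "model" is a scalar loss per sampled prompt.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_prompt(logits, rng):
    """Draw one token index per prompt position from its categorical distribution."""
    return [rng.choices(range(len(row)), weights=softmax(row))[0] for row in logits]


def policy_gradient_step(logits, black_box_loss, rng, num_samples=8, lr=0.5):
    """One in-place update of `logits` using only loss values from the black box."""
    prompts = [sample_prompt(logits, rng) for _ in range(num_samples)]
    losses = [black_box_loss(p) for p in prompts]  # the only model access
    baseline = sum(losses) / num_samples           # mean loss as variance-reduction baseline
    for i, row in enumerate(logits):
        probs = softmax(row)
        grad = [0.0] * len(row)
        for prompt, loss in zip(prompts, losses):
            for j in range(len(row)):
                # d/d logit_j of log p(sampled token) = 1[j == token] - p_j
                score = (1.0 if j == prompt[i] else 0.0) - probs[j]
                grad[j] += (loss - baseline) * score / (num_samples - 1)
        for j in range(len(row)):
            row[j] -= lr * grad[j]                 # descend on the expected loss
    return baseline


def toy_loss(prompt, target=(2, 0, 1)):
    """Hypothetical stand-in for an API call: counts positions that miss a hidden target."""
    return float(sum(a != b for a, b in zip(prompt, target)))


def train(steps=300, seed=0):
    """Learn a 3-token prompt over 4 candidate tokens; return the most likely token per position."""
    rng = random.Random(seed)
    logits = [[0.0] * 4 for _ in range(3)]
    for _ in range(steps):
        policy_gradient_step(logits, toy_loss, rng)
    return [max(range(len(row)), key=row.__getitem__) for row in logits]
```

Each step costs `num_samples` queries, so a budget of API calls translates directly into a cap on `steps * num_samples`. Subtracting the mean loss does not change the expected gradient, but it removes the component shared by all samples; in particular, if every sampled prompt receives the same loss, the update is exactly zero.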
