It has been demonstrated that prompt tuning is highly effective in efficiently eliciting knowledge from language models (LMs). However, prompt tuning still lags behind fine-tuning, especially when the LMs are small. P-tuning v2 (Liu et al., 2021b) makes prompt tuning comparable with fine-tuning by adding continuous prompts to every layer of the pre-trained model. However, prepending the same fixed soft prompts to all instances, regardless of how they differ, is questionable: the position at which prompts are inserted, their length, and their representations for diverse instances across different tasks could all affect prompt tuning performance. To fill this gap, we propose dynamic prompting (DP), in which the prompt position, length, and representation are all dynamically optimized with respect to different tasks and instances. We conduct comprehensive experiments on the SuperGLUE benchmark to validate our hypothesis and demonstrate substantial improvements. We also derive a unified framework supporting our dynamic prompting strategy; in particular, we use a simple learning network with Gumbel-Softmax to learn instance-dependent guidance. Experimental results show that simple instance-level, position-aware soft prompts improve classification accuracy by up to 6 points on average across five datasets, reducing the gap with fine-tuning. We further demonstrate its usefulness under full-data, few-shot, and multitask regimes, and combining these components together further unleashes the power of DP, narrowing the gap with fine-tuning even more.
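To make the mechanism concrete, below is a minimal PyTorch sketch of instance-dependent prompt positioning with Gumbel-Softmax. The module name `DynamicPositionPrompt`, the mean-pooled position scorer, and all shapes are illustrative assumptions for exposition, not the paper's released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicPositionPrompt(nn.Module):
    """Hypothetical sketch: a lightweight scorer proposes a per-instance
    insertion point for a soft prompt, and Gumbel-Softmax (straight-through)
    makes the discrete position choice differentiable."""

    def __init__(self, hidden_dim: int, prompt_len: int, num_positions: int):
        super().__init__()
        # Trainable soft prompt shared across instances.
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden_dim) * 0.02)
        # Simple learning network scoring each candidate insertion position.
        self.pos_scorer = nn.Linear(hidden_dim, num_positions)
        self.num_positions = num_positions

    def forward(self, token_embeds: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        # token_embeds: (batch, seq_len, hidden_dim); requires num_positions <= seq_len + 1.
        batch = token_embeds.size(0)
        logits = self.pos_scorer(token_embeds.mean(dim=1))          # (batch, num_positions)
        # Near one-hot position weights; hard=True uses the straight-through estimator.
        pos_weights = F.gumbel_softmax(logits, tau=tau, hard=True)  # (batch, num_positions)

        # Splice the prompt at every candidate position, then blend with the
        # one-hot weights so gradients reach the position scorer.
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        candidates = torch.stack(
            [torch.cat([token_embeds[:, :p], prompt, token_embeds[:, p:]], dim=1)
             for p in range(self.num_positions)],
            dim=1)                                                  # (batch, P, seq+prompt, hidden)
        return (pos_weights[:, :, None, None] * candidates).sum(dim=1)
```

Usage would follow the usual soft-prompt recipe: feed the returned embeddings into a frozen pre-trained encoder and train only the prompt and scorer parameters, annealing `tau` during training so the position choice sharpens toward a hard selection.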