Recently introduced language model prompting methods can achieve high accuracy in zero- and few-shot settings while requiring few to no learned task-specific parameters. Nevertheless, these methods still often trail behind full model finetuning. In this work, we investigate whether a dedicated continued pretraining stage can improve "promptability", i.e., zero-shot performance with natural language prompts or few-shot performance with prompt tuning. We reveal settings where existing continued pretraining methods lack promptability. We also identify current methodological gaps, which we fill with thorough large-scale experiments. We demonstrate that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, improves promptability in both zero- and few-shot settings by up to 31% relative to existing methods. On the other hand, we find that continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance. We validate our findings with two prompt tuning methods and, based on our results, provide concrete recommendations to optimize promptability for different use cases.