Continued Pretraining for Better Zero- and Few-Shot Promptability

Recently introduced language model prompting methods can achieve high accuracy in zero- and few-shot settings while requiring few to no learned task-specific parameters. Nevertheless, these methods still often trail behind full model finetuning. In this work, we investigate if a dedicated continued pretraining stage could improve "promptability", i.e., zero-shot performance with natural language prompts or few-shot performance with prompt tuning. We reveal settings where existing continued pretraining methods lack promptability. We also identify current methodological gaps, which we fill with thorough large-scale experiments. We demonstrate that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings compared to existing methods, up to 31% relative. On the other hand, we find that continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance. We validate our findings with two prompt tuning methods, and, based on our results, we provide concrete recommendations to optimize promptability for different use cases.
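
As a rough illustration of the few-shot prompt tuning setting the abstract refers to, the following toy sketch trains only a soft prompt (a small matrix of vectors prepended to the input embeddings) while every other parameter stays frozen. It is not the paper's implementation: the "model" is a hypothetical stand-in (random embeddings, mean pooling, a linear head), and all names, sizes, and data here are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim, n_classes, prompt_len = 20, 8, 3, 4

# Frozen "pretrained" parameters (random stand-ins, assumed for illustration).
E = rng.normal(size=(vocab, dim))        # token embedding table
W = rng.normal(size=(dim, n_classes))    # classification head
E_before, W_before = E.copy(), W.copy()

# The only trainable parameters: a soft prompt of prompt_len vectors.
P = np.zeros((prompt_len, dim))

# A tiny hypothetical few-shot set: 4 sequences of 5 token ids, with labels.
X = rng.integers(0, vocab, size=(4, 5))
y = np.array([0, 1, 2, 1])


def loss_and_grad(P, X, y):
    """Cross-entropy loss and its gradient with respect to the prompt only."""
    batch = X.shape[0]
    tokens = E[X]                                        # (B, n, d)
    prompt = np.broadcast_to(P, (batch,) + P.shape)      # (B, k, d)
    seq = np.concatenate([prompt, tokens], axis=1)       # (B, k+n, d)
    h = seq.mean(axis=1)                                 # mean pooling -> (B, d)
    logits = h @ W
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(batch), y]).mean()

    dlogits = probs.copy()
    dlogits[np.arange(batch), y] -= 1.0
    dlogits /= batch
    dh = dlogits @ W.T                                   # (B, d)
    # Each prompt row enters the mean with weight 1 / (k + n).
    d_row = dh.sum(axis=0) / seq.shape[1]
    dP = np.tile(d_row, (P.shape[0], 1))
    return loss, dP


initial_loss, _ = loss_and_grad(P, X, y)
for _ in range(200):
    _, dP = loss_and_grad(P, X, y)
    P -= 0.5 * dP                                        # update the prompt only
final_loss, _ = loss_and_grad(P, X, y)

print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

In this sketch the gradient is taken with respect to `P` alone, so `E` and `W` never change. The recipe described in the abstract differs in that it also incorporates a trainable prompt during multi-task continued pretraining, a stage this toy example does not model.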

