Continued Pretraining for Better Zero- and Few-Shot Promptability

Recently introduced language model prompting methods can achieve high accuracy in zero- and few-shot settings while requiring few to no learned task-specific parameters. Nevertheless, these methods still often trail behind full model finetuning. In this work, we investigate if a dedicated continued pretraining stage could improve "promptability", i.e., zero-shot performance with natural language prompts or few-shot performance with prompt tuning. We reveal settings where existing continued pretraining methods lack promptability. We also identify current methodological gaps, which we fill with thorough large-scale experiments. We demonstrate that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings compared to existing methods, up to 31% relative. On the other hand, we find that continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance. We validate our findings with two prompt tuning methods, and, based on our results, we provide concrete recommendations to optimize promptability for different use cases.
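The recipe named in the abstract, a trainable prompt incorporated during multi-task continued pretraining, can be illustrated with a toy sketch. Everything below is hypothetical and is not the authors' implementation: the class `SoftPromptModel`, its methods, and the stage names are invented for illustration, the prompt is assumed to be a short sequence of vectors prepended to the token embeddings, and the sketch assumes the model and prompt are updated together during continued pretraining while only the prompt is updated during downstream prompt tuning. No actual training is performed; only the input construction and the choice of trainable parameters are shown.

```python
import random


def init_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random floats."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class SoftPromptModel:
    """Toy stand-in for a language model with a trainable soft prompt.

    Hypothetical names throughout; only input-side mechanics are modeled.
    """

    def __init__(self, vocab_size, dim, prompt_len, seed=0):
        rng = random.Random(seed)
        # Stand-in for the model's own parameters.
        self.embedding = init_matrix(vocab_size, dim, rng)
        # Trainable prompt vectors, shared across inputs.
        self.prompt = init_matrix(prompt_len, dim, rng)

    def build_inputs(self, token_ids):
        """Prepend the prompt vectors to the embedded input tokens."""
        prompt_part = [row[:] for row in self.prompt]
        token_part = [self.embedding[t][:] for t in token_ids]
        return prompt_part + token_part

    def trainable(self, stage):
        """Return the parameter groups assumed to be updated in each stage."""
        if stage == "continued_pretraining":
            # Assumption: model and prompt are trained jointly on many tasks.
            return ["embedding", "prompt"]
        if stage == "prompt_tuning":
            # Assumption: the model is frozen and only the prompt adapts.
            return ["prompt"]
        raise ValueError(f"unknown stage: {stage}")


model = SoftPromptModel(vocab_size=10, dim=4, prompt_len=3)
inputs = model.build_inputs([1, 5, 7])
print(len(inputs), len(inputs[0]))  # 3 prompt vectors + 3 tokens, each of width 4
```

Under these assumptions, the point of the continued pretraining stage is that the model has already been trained with a prompt in its input by the time a new task arrives, so a later stage that updates only `prompt` starts from a model accustomed to that input format.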

