Prompt tuning is a parameter-efficient method that learns soft prompts to condition frozen language models for specific downstream tasks. Though effective, prompt tuning in few-shot settings both relies heavily on a good initialization of the soft prompts and is prone to overfitting. Existing works leverage pre-training or supervised meta-learning to initialize soft prompts, but they cannot generalize data-efficiently to unseen downstream tasks. To address these problems, this paper proposes SUPMER, a novel Self-sUpervised meta-Prompt learning framework with meta-gradient Regularization for few-shot generalization. We first design a set of self-supervised anchor meta-training tasks with different task formats and further enrich the task distribution with curriculum-based task augmentation. A novel meta-gradient regularization method is then integrated into meta-prompt learning: it meta-learns to transform the raw gradients computed during few-shot learning into a domain-generalizable direction, thus alleviating overfitting. Extensive experiments show that SUPMER achieves better performance across different few-shot downstream tasks and exhibits stronger domain generalization ability.
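To make the meta-gradient regularization idea concrete, below is a minimal PyTorch sketch, not the paper's implementation: the raw support-set gradient of the soft prompt is routed through a learned transform before the inner-loop update, and both the transform and the prompt initialization are meta-learned from the query-set loss in a MAML-style outer loop. The toy `task_loss`, the linear "frozen model", and all names and hyperparameters here are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' code) of meta-prompt learning
# with meta-gradient regularization.
import torch

prompt_len, dim = 4, 16          # soft-prompt length and embedding size

# Meta-parameters: the soft-prompt initialization and a learned
# gradient transform (starts as the identity, i.e. no regularization).
prompt_init = torch.zeros(prompt_len, dim, requires_grad=True)
grad_transform = torch.eye(dim, requires_grad=True)

def task_loss(prompt, batch):
    """Toy stand-in for a frozen LM's loss given a soft prompt."""
    x, y = batch
    preds = x @ prompt.mean(0)   # (batch, dim) @ (dim,) -> (batch,)
    return torch.nn.functional.mse_loss(preds, y)

def meta_step(support, query, inner_lr=0.1, outer_lr=0.01):
    # Inner loop: adapt the prompt on the support set, but pass the
    # raw gradient through the learned transform first.
    raw_grad, = torch.autograd.grad(
        task_loss(prompt_init, support), prompt_init, create_graph=True)
    regularized_grad = raw_grad @ grad_transform
    adapted_prompt = prompt_init - inner_lr * regularized_grad

    # Outer loop: evaluate the adapted prompt on the query set and
    # update both meta-parameters through the inner-loop update.
    outer = task_loss(adapted_prompt, query)
    g_init, g_tf = torch.autograd.grad(outer, [prompt_init, grad_transform])
    with torch.no_grad():
        prompt_init -= outer_lr * g_init
        grad_transform -= outer_lr * g_tf
    return outer.item()

# Toy usage with random support/query batches standing in for a task.
support = (torch.randn(8, dim), torch.randn(8))
query = (torch.randn(8, dim), torch.randn(8))
print(f"query loss after one meta-step: {meta_step(support, query):.4f}")
```

The key design point the sketch illustrates is that the transform is trained only through the query-set loss of the adapted prompt, so it is pushed to reshape gradients toward directions that generalize beyond the support set.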