Prompt tuning (PT), which tunes only the embeddings of an additional sequence of tokens per task while keeping the pre-trained language model (PLM) frozen, has shown remarkable performance in few-shot learning. Despite this, PT has been shown to rely heavily on a good initialization of the prompt embeddings. In this work, we study meta prompt tuning (MPT) to systematically explore how (and whether) meta-learning can improve cross-task generalization in PT by learning to initialize the prompt embeddings from other relevant tasks. We empirically analyze a representative set of meta-learning algorithms across a wide range of adaptation settings with different source/target task configurations on a large set of few-shot tasks. Through extensive experiments and analysis, we demonstrate the effectiveness of MPT. We find the improvement to be particularly significant on classification tasks. For other kinds of tasks, such as question answering, we observe that while MPT can outperform PT in most cases, it does not always outperform multi-task learning. We further provide an in-depth analysis from the perspective of task similarity.
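To make the setup concrete, below is a minimal, self-contained sketch of one way MPT can be instantiated: first-order MAML (FOMAML) applied only to the soft prompt embeddings, with a frozen toy module standing in for the PLM. All names here (ToyPLM, inner_adapt, meta_train) and all hyperparameters are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch: meta-learning a soft prompt initialization with first-order MAML.
# ToyPLM is a hypothetical frozen stand-in for a real pre-trained model;
# all names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

PROMPT_LEN, EMB_DIM, N_CLASSES = 8, 32, 2

class ToyPLM(nn.Module):
    """Frozen stand-in for a PLM: (soft prompt + input embeddings) -> logits."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(EMB_DIM, EMB_DIM)
        self.head = nn.Linear(EMB_DIM, N_CLASSES)
        for p in self.parameters():
            p.requires_grad_(False)  # the PLM stays frozen throughout

    def forward(self, prompt, x):
        # Prepend the soft prompt to the input sequence, mean-pool, classify.
        h = torch.cat([prompt.expand(x.size(0), -1, -1), x], dim=1)
        return self.head(torch.tanh(self.encoder(h)).mean(dim=1))

def inner_adapt(plm, prompt_init, support, steps=5, lr=0.1):
    """Ordinary prompt tuning on one task: clone the shared initialization
    and tune only the prompt embeddings on the task's support set."""
    prompt = prompt_init.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([prompt], lr=lr)
    x, y = support
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(plm(prompt, x), y).backward()
        opt.step()
    return prompt

def meta_train(plm, tasks, meta_steps=20, meta_lr=0.05):
    """FOMAML outer loop: treat each adapted prompt's query-set gradient
    as the gradient of the shared prompt initialization."""
    prompt_init = nn.Parameter(0.02 * torch.randn(1, PROMPT_LEN, EMB_DIM))
    meta_opt = torch.optim.SGD([prompt_init], lr=meta_lr)
    for _ in range(meta_steps):
        meta_opt.zero_grad()
        for support, query in tasks:
            adapted = inner_adapt(plm, prompt_init, support)
            adapted.grad = None  # drop leftover inner-loop gradients
            xq, yq = query
            F.cross_entropy(plm(adapted, xq), yq).backward()
            g = adapted.grad / len(tasks)  # first-order approximation
            prompt_init.grad = g.clone() if prompt_init.grad is None \
                else prompt_init.grad + g
        meta_opt.step()
    return prompt_init  # used as the PT initialization on unseen target tasks

if __name__ == "__main__":
    torch.manual_seed(0)
    plm = ToyPLM()
    def toy_task():  # random support/query split of a synthetic task
        x, y = torch.randn(16, 4, EMB_DIM), torch.randint(0, N_CLASSES, (16,))
        return (x[:8], y[:8]), (x[8:], y[8:])
    init = meta_train(plm, [toy_task() for _ in range(4)])
    print("meta-learned prompt init:", tuple(init.shape))
```

In the actual study, the inner loop runs prompt tuning on a real PLM, and FOMAML is only one representative algorithm; Reptile-style updates or multi-task pre-training of the prompt would slot into the same outer loop as alternatives.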