During zero-shot inference with language models (LMs), hard prompts alone may not fully describe the target task. In this paper, we explore how retrieving soft prompts obtained through prompt tuning can assist hard prompts in zero-shot task generalization. Specifically, we train a soft prompt embedding for each prompt through prompt tuning, store samples of the training instances (hard prompt + input instance) mapped to their prompt embeddings, and, during inference, retrieve the prompt embedding of the training instance closest to the query instance. Results show that this simple approach enhances the performance of T0 on unseen tasks, outperforming it on 10 out of 11 datasets and improving its mean accuracy on the BIG-bench benchmark by 2.39 percentage points while adding only 0.007% additional parameters. In addition, interpolating multiple embeddings and applying variance-based ranking further improve accuracy and robustness to different evaluation prompts, widening the performance gap. Finally, we find that retrieving source embeddings trained on similar answer choice formats is more important than retrieving those trained on similar task types. Model checkpoints and code implementation are available at https://github.com/seonghyeonye/RoSPr.
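The retrieval step described above can be illustrated with a minimal sketch. It is not the released RoSPr implementation: the `Entry` class, the `retrieve_soft_prompt` function, the toy vectors, and the use of cosine similarity as the notion of "closest" are all assumptions made for illustration, and the encoder that would produce the instance vectors is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    """One stored training instance (hard prompt + input), already encoded."""
    vector: list    # dense representation of the instance (encoder is assumed)
    prompt_id: str  # key of the soft prompt trained for this instance's prompt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_soft_prompt(query_vec, library, soft_prompts):
    """Return (prompt_id, soft prompt) of the stored instance closest to the query."""
    best = max(library, key=lambda e: cosine(query_vec, e.vector))
    return best.prompt_id, soft_prompts[best.prompt_id]


# Hypothetical toy data: 2-d instance vectors and tiny stand-in soft prompts.
soft_prompts = {
    "qa_prompt": [[0.1, 0.2], [0.3, 0.4]],
    "nli_prompt": [[0.9, 0.8], [0.7, 0.6]],
}
library = [
    Entry([1.0, 0.0], "qa_prompt"),
    Entry([0.9, 0.1], "qa_prompt"),
    Entry([0.0, 1.0], "nli_prompt"),
]

# At inference time, the query instance is encoded and matched against the library.
prompt_id, embedding = retrieve_soft_prompt([0.1, 0.95], library, soft_prompts)
print(prompt_id)
```

In an actual system the retrieved embedding would be prepended to the frozen model's input; the sketch only covers the lookup from query instance to stored prompt embedding.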