Self-supervised pretraining has made few-shot learning possible for many NLP tasks. However, pretraining objectives are not typically adapted specifically for in-context few-shot learning. In this paper, we propose to use self-supervision in an intermediate training stage between pretraining and downstream few-shot usage, with the goal of teaching the model to perform in-context few-shot learning. We propose and evaluate four self-supervised objectives on two benchmarks. We find that the intermediate self-supervision stage produces models that outperform strong baselines. Ablation studies show that several factors affect downstream performance, such as the amount of training data and the diversity of the self-supervised objectives. Human-annotated cross-task supervision and self-supervision are complementary. Qualitative analysis suggests that models trained with self-supervision are better at following task requirements.