Few-shot abstractive summarization is a challenging task in natural language generation. To support it, we design a novel soft prompt architecture coupled with a prompt pre-training plus fine-tuning paradigm that is effective while tuning only an extremely small fraction of the parameters. The soft prompts comprise continuous input embeddings across the encoder and the decoder to fit the structure of the generation model. Importantly, a novel inner-prompt placed within the text is introduced to capture document-level information. The aim is to direct attention toward understanding the document so that the model is better prompted to generate document-related content. The summarization procedure first conducts prompt pre-training with self-supervised pseudo-data, which teaches the model basic summarization capabilities; the model is then fine-tuned on few-shot examples. Experimental results on the CNN/DailyMail and XSum datasets show that our method, tuning only 0.1% of the parameters, outperforms full-model tuning, in which all model parameters are tuned. It also surpasses Prompt Tuning by a large margin and delivers competitive results against Prefix-Tuning, which tunes 3% of the parameters.
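To make the encoder-decoder soft-prompt idea concrete, the following is a minimal sketch, not the authors' implementation: trainable continuous prompt embeddings are prepended to the input embeddings on both the encoder and decoder sides of a frozen seq2seq backbone. The names (`SoftPromptWrapper`, `base_model`, the prompt lengths) and the generic `base_model(enc_in, dec_in)` call are illustrative assumptions, and the inner-prompts interleaved within the document text are omitted here.

```python
import torch
import torch.nn as nn


class SoftPromptWrapper(nn.Module):
    """Sketch: learnable soft prompts for a frozen encoder-decoder model."""

    def __init__(self, base_model, embed_dim, enc_prompt_len=20, dec_prompt_len=20):
        super().__init__()
        self.base_model = base_model
        # Freeze the backbone; only the prompt embeddings receive gradients.
        for p in self.base_model.parameters():
            p.requires_grad = False
        # Continuous prompt embeddings for the encoder and decoder inputs.
        self.enc_prompt = nn.Parameter(torch.randn(enc_prompt_len, embed_dim) * 0.02)
        self.dec_prompt = nn.Parameter(torch.randn(dec_prompt_len, embed_dim) * 0.02)

    def forward(self, src_embeds, tgt_embeds):
        # src_embeds: (batch, src_len, embed_dim); tgt_embeds: (batch, tgt_len, embed_dim)
        b = src_embeds.size(0)
        enc_in = torch.cat([self.enc_prompt.unsqueeze(0).expand(b, -1, -1), src_embeds], dim=1)
        dec_in = torch.cat([self.dec_prompt.unsqueeze(0).expand(b, -1, -1), tgt_embeds], dim=1)
        # Assumes a backbone that takes batch-first (src, tgt) embedding tensors,
        # e.g. nn.Transformer(batch_first=True).
        return self.base_model(enc_in, dec_in)
```

With a wrapper of this kind, only `enc_prompt` and `dec_prompt` are updated during prompt pre-training and few-shot fine-tuning, which is what keeps the number of tuned parameters to a small fraction of the full model.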