The diverse demands of different summarization tasks and their high annotation costs are driving a need for few-shot summarization. However, despite the emergence of many summarization tasks and datasets, the current training paradigm for few-shot summarization systems ignores potentially shareable knowledge in heterogeneous datasets. To this end, we propose \textsc{UniSumm}, a unified few-shot summarization model that is pre-trained on multiple summarization tasks and can be prefix-tuned to excel at any few-shot summarization dataset. Meanwhile, to better evaluate few-shot summarization systems, following the principles of diversity and robustness, we assemble and release a new benchmark, \textsc{SummZoo}. It consists of $8$ diverse summarization tasks, with multiple sets of few-shot samples for each task, covering both monologue and dialogue domains. Experimental results and ablation studies show that \textsc{UniSumm} outperforms strong baseline systems by a large margin across all tasks in \textsc{SummZoo} under both automatic and human evaluation. We release our code and benchmark at \url{https://github.com/microsoft/UniSumm}.