Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation

Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a novel and generalizable method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner. WikiTransfer fine-tunes pretrained models on pseudo-summaries, produced from generic Wikipedia data, which contain characteristics of the target dataset, such as the length and level of abstraction of the desired summaries. WikiTransfer models achieve state-of-the-art, zero-shot abstractive summarization performance on the CNN-DailyMail dataset and demonstrate the effectiveness of our approach on three additional diverse datasets. These models are more robust to noisy data and also achieve better or comparable few-shot performance using 10 and 100 training examples when compared to few-shot transfer from other summarization datasets. To further boost performance, we employ data augmentation via round-trip translation as well as introduce a regularization term for improved few-shot transfer. To understand the role of dataset aspects in transfer performance and the quality of the resulting output summaries, we further study the effect of the components of our unsupervised fine-tuning data and analyze few-shot performance using both automatic and human evaluation.
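
The idea of building pseudo-summaries from generic Wikipedia text that match a target dataset's summary length and level of abstraction can be illustrated with a minimal sketch. The abstract does not give the construction procedure, so everything below is an assumption made for illustration: the leading sentences of an article stand in for the summary, the remaining sentences stand in for the source document, and "level of abstraction" is approximated by the fraction of summary word types that do not occur in the source. The function names and parameters (`make_pseudo_example`, `summary_sentences`, `min_novelty`, `max_novelty`) are hypothetical and are not taken from the paper.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_types(text):
    """Lowercased set of alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novel_unigram_ratio(summary, source):
    """Fraction of summary word types that never appear in the source.

    Used here as a rough, assumed proxy for how abstractive a summary is.
    """
    summary_words = word_types(summary)
    if not summary_words:
        return 0.0
    return len(summary_words - word_types(source)) / len(summary_words)


def make_pseudo_example(article, summary_sentences, min_novelty, max_novelty):
    """Build one (source, pseudo-summary) pair from a single article.

    Assumption: the first `summary_sentences` sentences act as the summary
    and the rest act as the document. The pair is kept only if its novelty
    falls inside the range chosen to mimic the target dataset.
    """
    sentences = split_sentences(article)
    if len(sentences) <= summary_sentences:
        return None  # nothing would be left to serve as the source
    summary = " ".join(sentences[:summary_sentences])
    source = " ".join(sentences[summary_sentences:])
    novelty = novel_unigram_ratio(summary, source)
    if not (min_novelty <= novelty <= max_novelty):
        return None  # abstraction level does not match the target dataset
    return {"source": source, "summary": summary, "novelty": novelty}
```

In this sketch, `summary_sentences` would be set from the typical summary length of the target dataset, and the novelty range from how abstractive its reference summaries are. Pairs that pass the filter could then serve as unsupervised fine-tuning data for a pretrained model.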
