In this paper, we present a denoising sequence-to-sequence (seq2seq) autoencoder enhanced with contrastive learning for abstractive text summarization. Our model adopts a standard Transformer-based architecture with a multi-layer bidirectional encoder and an autoregressive decoder. To strengthen its denoising ability, we incorporate self-supervised contrastive learning together with several sentence-level document augmentations. The two components, the seq2seq autoencoder and contrastive learning, are trained jointly during fine-tuning, which improves summarization performance as measured by ROUGE scores and human evaluation. We conduct experiments on two datasets and show that our model outperforms many existing baselines and achieves performance comparable to state-of-the-art abstractive systems that are trained with more complex architectures and far greater computational resources.
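The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the augmentations, the form of the contrastive loss, or how the two losses are weighted, so everything below is an assumption made for illustration: `augment` uses random sentence deletion followed by shuffling, `contrastive_loss` is an InfoNCE-style loss over cosine similarities, and `joint_loss` is a simple weighted sum. The function names, `drop_prob`, `temperature`, and `weight` are all hypothetical and are not taken from the paper.

```python
import math
import random


def split_sentences(document):
    """Naive splitter on '.', used only for illustration (an assumption)."""
    parts = [s.strip() for s in document.split(".")]
    return [s + "." for s in parts if s]


def augment(document, rng, drop_prob=0.2):
    """Hypothetical sentence-level augmentation: randomly delete sentences,
    then shuffle the ones that remain."""
    sentences = split_sentences(document)
    if not sentences:
        return document
    kept = [s for s in sentences if rng.random() >= drop_prob]
    if not kept:
        # Always keep at least one sentence so the view is not empty.
        kept = [rng.choice(sentences)]
    rng.shuffle(kept)
    return " ".join(kept)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor: low when the anchor is closer to
    the positive (another view of the same document) than to the negatives
    (views of other documents)."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def joint_loss(seq2seq_loss, contrastive, weight=1.0):
    """Assumed combination: summarization loss plus weighted contrastive loss."""
    return seq2seq_loss + weight * contrastive
```

In a real system the vectors passed to `contrastive_loss` would be encoder representations of two augmented views of each document, and `seq2seq_loss` would be the decoder's cross-entropy against the reference summary; here they are plain Python numbers and lists so that the sketch stays self-contained.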