We introduce EfficientCL, a memory-efficient continual pretraining method that applies contrastive learning with novel data augmentation and curriculum learning. For data augmentation, we stack two types of operations sequentially: cutoff and PCA jittering. As pretraining proceeds, we apply curriculum learning by incrementing the augmentation degree at each difficulty step. After data augmentation, contrastive learning is applied to the projected embeddings of the original and augmented examples. When fine-tuned on the GLUE benchmark, our model outperforms baseline models, especially on sentence-level tasks. Moreover, this improvement is achieved with only 70% of the computational memory required by the baseline model.
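The following Python sketch (not the authors' released code) illustrates the described pipeline under stated assumptions: the helpers cutoff, pca_jitter, curriculum_degree, and info_nce, as well as all hyperparameter values, are hypothetical stand-ins for the paper's actual components.

```python
# Minimal sketch of the EfficientCL-style pipeline: curriculum-scaled
# cutoff + PCA jittering augmentation, followed by a contrastive loss on
# original vs. augmented sentence embeddings. All names and values here
# are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F


def cutoff(embeds: torch.Tensor, ratio: float) -> torch.Tensor:
    """Zero out a random contiguous span of token positions (span cutoff)."""
    batch, seq_len, _ = embeds.shape
    span = max(1, int(seq_len * ratio))
    out = embeds.clone()
    for i in range(batch):
        start = torch.randint(0, seq_len - span + 1, (1,)).item()
        out[i, start:start + span, :] = 0.0
    return out


def pca_jitter(embeds: torch.Tensor, sigma: float) -> torch.Tensor:
    """Add noise along the principal components of the token embeddings."""
    batch, seq_len, dim = embeds.shape
    flat = embeds.reshape(-1, dim)
    centered = flat - flat.mean(dim=0, keepdim=True)
    _, s, vh = torch.linalg.svd(centered, full_matrices=False)  # rows of vh: principal directions
    eigvals = s ** 2 / (flat.size(0) - 1)                       # variance along each direction
    alpha = sigma * torch.randn(dim, device=embeds.device)      # random per-component magnitudes
    perturbation = (alpha * eigvals) @ vh                       # (dim,) offset in embedding space
    return embeds + perturbation.view(1, 1, dim)


def curriculum_degree(step: int, total_steps: int, num_levels: int = 10) -> float:
    """Map training progress to an augmentation degree in (0, 1] that grows stepwise."""
    level = min(num_levels, 1 + step * num_levels // max(1, total_steps))
    return level / num_levels


def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """Contrastive loss between projected original and augmented examples."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                          # (batch, batch) similarities
    labels = torch.arange(z1.size(0), device=z1.device)         # positives on the diagonal
    return F.cross_entropy(logits, labels)


# Toy usage: random tensors stand in for encoder outputs and a projection head.
embeds = torch.randn(8, 32, 128)                                # (batch, seq_len, hidden)
degree = curriculum_degree(step=5_000, total_steps=100_000)
augmented = pca_jitter(cutoff(embeds, ratio=0.1 * degree), sigma=0.05 * degree)
z_orig = embeds.mean(dim=1)                                     # stand-in for projected embedding
z_aug = augmented.mean(dim=1)
loss = info_nce(z_orig, z_aug)
```

In this sketch the curriculum simply scales both augmentation strengths with a discrete difficulty level that increases over training, matching the abstract's description of incrementing the augmentation degree at each difficulty step.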