Supervised learning, while prevalent for information cascade modeling, often requires abundant labeled data for training, and the trained models do not easily generalize across tasks and datasets. Semi-supervised learning exploits unlabeled data during pre-training for cascade understanding. It often learns fine-grained, feature-level representations, which can easily result in overfitting on downstream tasks. Recently, contrastive self-supervised learning has been proposed to alleviate these two fundamental issues in linguistic and visual tasks. However, its direct applicability to cascade modeling, especially graph cascade related tasks, remains underexplored. In this work, we present Contrastive Cascade Graph Learning (CCGL), a novel framework for cascade graph representation learning in a contrastive, self-supervised, and task-agnostic way. In particular, CCGL first designs an effective data augmentation strategy to capture variation and uncertainty. Second, it learns a generic model for graph cascade tasks via self-supervised contrastive pre-training using both unlabeled and labeled data. Third, CCGL learns a task-specific cascade model via fine-tuning using labeled data. Finally, to make the model transferable across datasets and cascade applications, CCGL further enhances the model via distillation using a teacher-student architecture. We demonstrate that CCGL significantly outperforms its supervised and semi-supervised counterparts on several downstream tasks.
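To make the contrastive pre-training step concrete, the following is a minimal sketch (not the authors' implementation): two augmented "views" of each cascade are encoded and pulled together with an InfoNCE-style (NT-Xent) loss over unlabeled data. The encoder, feature dimensions, and augmentation here are hypothetical placeholders; CCGL's actual graph encoder and cascade-specific augmentation strategy differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Placeholder encoder mapping a cascade feature vector to a unit-norm embedding."""
    def __init__(self, in_dim=64, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, out_dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)

def info_nce(z1, z2, temperature=0.5):
    """NT-Xent loss: each view's positive is the other view of the same cascade."""
    z = torch.cat([z1, z2], dim=0)                   # (2N, d)
    sim = z @ z.t() / temperature                    # cosine similarities (embeddings are unit-norm)
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float('-inf'))  # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])    # positive-pair indices
    return F.cross_entropy(sim, targets)

def augment(x):
    """Stand-in augmentation (feature noise); CCGL augments the cascade graph itself."""
    return x + 0.1 * torch.randn_like(x)

if __name__ == "__main__":
    encoder = Encoder()
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
    cascades = torch.randn(256, 64)                  # toy unlabeled cascade features
    loss = info_nce(encoder(augment(cascades)), encoder(augment(cascades)))
    loss.backward()
    opt.step()
    print(f"contrastive loss: {loss.item():.4f}")
```

In the full pipeline described above, the pre-trained encoder would subsequently be fine-tuned on labeled data for a specific cascade task and then distilled into a student model for transfer across datasets.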