In machine learning, data augmentation generally refers to methods (such as data distillation or balancing positive and negative samples) used to improve the quality of a model's dataset and enrich the data.
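As one concrete illustration of such a technique, the sketch below balances positive and negative samples by randomly oversampling the minority class. It is a minimal, hypothetical example with made-up data, not a production pipeline; libraries such as imbalanced-learn offer more principled resamplers.

```python
import numpy as np

def oversample_minority(X: np.ndarray, y: np.ndarray, seed: int = 0):
    """Duplicate minority-class rows at random until both classes have equal size."""
    rng = np.random.default_rng(seed)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([majority, minority, extra])
    return X[idx], y[idx]

# Usage: 8 negative samples and only 2 positive ones -> balanced 8/8 after resampling.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)
X_bal, y_bal = oversample_minority(X, y)
```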


Code: https://github.com/Shen-Lab/GraphCL
Paper: https://arxiv.org/abs/2010.13902

Generalizable, transferable, and robust representation learning on graph-structured data remains a challenge for current graph neural networks (GNNs). Unlike convolutional neural networks (CNNs) developed for image data, self-supervised learning and pre-training are rarely used for GNNs. In this paper, we propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data. We first design four types of graph augmentations to incorporate different priors. We then systematically study the impact of various combinations of graph augmentations on multiple datasets in four different settings: semi-supervised learning, unsupervised representation learning, transfer learning, and adversarial attacks. The results show that, even without tuning the augmentation extents or using sophisticated GNN architectures, our GraphCL framework can produce graph representations with similar or better generalizability, transferability, and robustness compared with state-of-the-art methods. We also investigate the impact of parameterized graph augmentation extents and patterns, and observe further performance gains in preliminary experiments.
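The four augmentations in the paper are node dropping, edge perturbation, attribute masking, and subgraph sampling. Below is a minimal sketch of the first two on a graph stored as a (2, E) edge-index array. It is an illustrative simplification, not the official GraphCL implementation from the linked repository; the edge-list format and the drop/perturb ratios are assumptions made for the example.

```python
import numpy as np

def drop_nodes(edge_index: np.ndarray, num_nodes: int, drop_ratio: float = 0.2):
    """Randomly remove a fraction of nodes and all edges incident to them."""
    keep = np.random.rand(num_nodes) >= drop_ratio
    mask = keep[edge_index[0]] & keep[edge_index[1]]
    return edge_index[:, mask]

def perturb_edges(edge_index: np.ndarray, num_nodes: int, perturb_ratio: float = 0.2):
    """Randomly delete a fraction of edges and add the same number of random edges."""
    num_edges = edge_index.shape[1]
    num_perturb = int(num_edges * perturb_ratio)
    keep_idx = np.random.choice(num_edges, num_edges - num_perturb, replace=False)
    new_edges = np.random.randint(0, num_nodes, size=(2, num_perturb))
    return np.concatenate([edge_index[:, keep_idx], new_edges], axis=1)

# Usage: two augmented "views" of a toy 5-node graph, as used in contrastive learning.
edges = np.array([[0, 0, 1, 2, 3, 3],
                  [1, 2, 2, 3, 4, 0]])
view1 = drop_nodes(edges, num_nodes=5)
view2 = perturb_edges(edges, num_nodes=5)
```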

Latest Papers

The advent of large pre-trained language models has given rise to rapid progress in the field of Natural Language Processing (NLP). While the performance of these models on standard benchmarks has scaled with size, compression techniques such as knowledge distillation have been key in making them practical. We present MATE-KD, a novel text-based adversarial training algorithm that improves the performance of knowledge distillation. MATE-KD first trains a masked-language-model-based generator to perturb text by maximizing the divergence between teacher and student logits. Then, using knowledge distillation, a student is trained on both the original and the perturbed training samples. We evaluate our algorithm, using BERT-based models, on the GLUE benchmark and demonstrate that MATE-KD outperforms competitive adversarial learning and data augmentation baselines. On the GLUE test set, our 6-layer RoBERTa-based model outperforms BERT-Large.
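Restating the training loop from the abstract as code may clarify the two alternating steps: the generator is updated to maximize the divergence between teacher and student logits on perturbed inputs, and the student is then distilled on both the original and perturbed samples. The sketch below is a heavily simplified, hypothetical version: the tiny linear layers stand in for the BERT-style teacher/student and the masked-LM generator, and the inputs are pre-embedded vectors rather than masked text.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, num_classes = 16, 2
teacher = torch.nn.Linear(dim, num_classes)    # stand-in for the frozen teacher model
student = torch.nn.Linear(dim, num_classes)    # stand-in for the compressed student
generator = torch.nn.Linear(dim, dim)          # stand-in for the masked-LM generator
for p in teacher.parameters():
    p.requires_grad_(False)                    # the teacher is never updated

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)

def kl(teacher_logits, student_logits):
    """KL divergence between teacher and student output distributions."""
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1),
                    reduction="batchmean")

x = torch.randn(8, dim)  # a batch of (already embedded) training inputs

# Step 1: update the generator to MAXIMIZE teacher-student divergence on perturbed inputs.
x_adv = x + generator(x)
loss_g = -kl(teacher(x_adv), student(x_adv))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# Step 2: distill the student on both the original and the perturbed batch.
x_adv = (x + generator(x)).detach()
loss_s = kl(teacher(x), student(x)) + kl(teacher(x_adv), student(x_adv))
opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```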
