Graph Neural Networks (GNNs) have demonstrated strong performance on tasks such as node classification, link prediction, and graph classification, yet they remain vulnerable to backdoor attacks that implant imperceptible triggers during training to control predictions at inference time. Whereas node-level attacks can exploit local message passing, graph-level attacks face the harder challenge of manipulating global graph representations while remaining stealthy. We identify two main sources of anomaly in existing graph-classification backdoor methods: structural deviation introduced by rare subgraph triggers and semantic deviation caused by label flipping, both of which make poisoned graphs easy for anomaly detection models to flag. To address this, we propose DPSBA, a clean-label backdoor framework that learns in-distribution triggers via adversarial training guided by anomaly-aware discriminators. By suppressing both structural and semantic anomalies, DPSBA achieves a high attack success rate while markedly improving stealth. Extensive experiments on real-world datasets validate that DPSBA achieves a superior trade-off between attack effectiveness and detectability compared to state-of-the-art baselines.
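To make the adversarial trigger-learning idea concrete, the sketch below shows one possible training loop: a generator learns a small, differentiable trigger subgraph, a discriminator (the anomaly-aware critic) tries to separate clean from poisoned graph embeddings, and the generator is optimized to both preserve the target-class prediction (clean-label attack loss) and fool the discriminator (stealth loss). This is a minimal illustration in plain PyTorch with synthetic embeddings; all module names, dimensions, the simplistic trigger-grafting step, and the loss weights are our own assumptions, not DPSBA's actual implementation.

```python
# Minimal sketch of adversarial in-distribution trigger learning.
# Everything here (names, dims, loss weights) is illustrative, not DPSBA itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, TRIGGER_NODES, TARGET_CLASS, NUM_CLASSES, BATCH = 32, 4, 0, 2, 16

class TriggerGenerator(nn.Module):
    """Learns node features and a relaxed (soft) adjacency for a small trigger subgraph."""
    def __init__(self):
        super().__init__()
        self.node_feats = nn.Parameter(torch.randn(TRIGGER_NODES, EMB_DIM))
        self.edge_logits = nn.Parameter(torch.zeros(TRIGGER_NODES, TRIGGER_NODES))
    def forward(self):
        adj = torch.sigmoid(self.edge_logits)  # differentiable edge weights in [0, 1]
        return self.node_feats, adj

class Discriminator(nn.Module):
    """Anomaly-aware critic: scores how in-distribution a graph embedding looks."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(EMB_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
    def forward(self, g):
        return self.net(g)

def embed_trigger(node_feats, adj):
    """Stand-in for a GNN readout: one propagation step followed by mean pooling."""
    h = F.relu(adj @ node_feats)
    return h.mean(dim=0)

classifier = nn.Linear(EMB_DIM, NUM_CLASSES)  # surrogate graph classifier head
gen, disc = TriggerGenerator(), Discriminator()
opt_g = torch.optim.Adam(list(gen.parameters()) + list(classifier.parameters()), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

for step in range(200):
    # Synthetic embeddings of clean target-class graphs; clean-label poisoning
    # only modifies graphs whose true label already equals the target class.
    clean_emb = torch.randn(BATCH, EMB_DIM)

    nf, adj = gen()
    trig_emb = embed_trigger(nf, adj)
    poisoned_emb = clean_emb + trig_emb  # simplistic stand-in for grafting the trigger

    # Discriminator step: learn to separate clean from poisoned embeddings.
    d_loss = (F.binary_cross_entropy_with_logits(disc(clean_emb), torch.ones(BATCH, 1))
              + F.binary_cross_entropy_with_logits(disc(poisoned_emb.detach()),
                                                   torch.zeros(BATCH, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: keep the target-class prediction (attack loss) while
    # making poisoned embeddings look in-distribution (stealth loss).
    targets = torch.full((BATCH,), TARGET_CLASS, dtype=torch.long)
    attack_loss = F.cross_entropy(classifier(poisoned_emb), targets)
    stealth_loss = F.binary_cross_entropy_with_logits(disc(poisoned_emb),
                                                      torch.ones(BATCH, 1))
    g_loss = attack_loss + 0.5 * stealth_loss  # weight 0.5 is an arbitrary choice here
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In a real attack, `clean_emb` would come from a GNN encoder over actual graphs and the trigger would be attached at the graph level rather than added in embedding space; the point of the sketch is only the two-player objective, where the stealth loss is what pushes the learned trigger toward the clean data distribution.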