Graph Neural Networks (GNNs) have achieved state-of-the-art performance on various graph-related tasks such as node classification and graph classification. However, GNNs are vulnerable to adversarial attacks. Existing works mainly focus on attacking GNNs for node classification; the attacks against GNNs for graph classification have not been well explored. In this work, we conduct a systematic study on adversarial attacks against GNNs for graph classification via perturbing the graph structure. In particular, we focus on the most challenging attack, i.e., the hard-label black-box attack, where an attacker has no knowledge about the target GNN model and can only obtain predicted labels by querying the target model. To achieve this goal, we formulate our attack as an optimization problem, whose objective is to minimize the number of edges to be perturbed in a graph while maintaining a high attack success rate. The original optimization problem is intractable, so we relax it to a tractable one, which we solve with a theoretical convergence guarantee. We also design a coarse-grained searching algorithm and a query-efficient gradient computation algorithm to decrease the number of queries to the target GNN model. Our experimental results on three real-world datasets demonstrate that our attack can effectively attack representative GNNs for graph classification with fewer queries and perturbations. We also evaluate the effectiveness of our attack under two defenses: one is a well-designed adversarial graph detector, and the other equips the target GNN model itself with a defense to prevent adversarial graph generation. Our experimental results show that such defenses are not effective enough, which highlights the need for more advanced defenses.
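The hard-label black-box setting described above can be illustrated with a minimal sketch: the attacker can only call a query oracle that returns a predicted label, and tries to flip as few edges as possible until the label changes. Everything below is hypothetical scaffolding, not the paper's actual algorithm: `query_label` is a toy stand-in for the target GNN, and the greedy edge-flipping loop is a deliberately simple baseline rather than the relaxed optimization the work proposes.

```python
import random
import numpy as np

def query_label(adj):
    # Toy hard-label oracle (hypothetical stand-in for the target GNN):
    # predicts class 1 if the graph has more than 3 edges, else class 0.
    return int(adj.sum() // 2 > 3)

def hard_label_attack(adj, orig_label, budget, rng):
    """Greedy sketch of a hard-label black-box structural attack:
    flip one edge at a time (add it if absent, remove it if present),
    querying only the predicted label after each flip, and stop as
    soon as the label differs or the perturbation budget is spent."""
    adj = adj.copy()
    n = adj.shape[0]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rng.shuffle(edges)  # candidate edge order chosen at random
    flips = []
    for (i, j) in edges:
        if len(flips) >= budget:
            break
        adj[i, j] ^= 1
        adj[j, i] ^= 1  # keep the adjacency matrix symmetric
        flips.append((i, j))
        if query_label(adj) != orig_label:
            return adj, flips  # successful adversarial graph
    return None, flips  # attack failed within the budget
```

On a complete 4-node graph (6 edges, label 1), every flip removes an edge, so the toy oracle's label changes after three flips; the number of queries equals the number of flips, which is exactly the quantity the paper's formulation tries to minimize.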