Deep learning models for graphs have achieved strong performance for the task of node classification. Despite their proliferation, there is currently no study of their robustness to adversarial attacks. Yet, in domains where they are likely to be used, e.g. the web, adversaries are common. Can deep learning models for graphs be easily fooled? In this work, we introduce the first study of adversarial attacks on attributed graphs, specifically focusing on models exploiting ideas of graph convolutions. In addition to attacks at test time, we tackle the more challenging class of poisoning/causative attacks, which focus on the training phase of a machine learning model. We generate adversarial perturbations targeting the nodes' features and the graph structure, thus taking the dependencies between instances into account. Moreover, we ensure that the perturbations remain unnoticeable by preserving important data characteristics. To cope with the underlying discrete domain, we propose an efficient algorithm, Nettack, exploiting incremental computations. Our experimental study shows that the accuracy of node classification drops significantly even when performing only a few perturbations. Even more, our attacks are transferable: the learned attacks generalize to other state-of-the-art node classification models and unsupervised approaches, and are likewise successful even when only limited knowledge about the graph is given.
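To make the setting concrete, the following is a minimal, illustrative sketch (not the authors' Nettack algorithm) of what the discrete perturbations described above look like in the poisoning setting: individual entries of a binary adjacency matrix and a binary feature matrix are flipped before the node classifier is trained on the corrupted data. All names used here (A, X, target, budget) are hypothetical.

```python
import numpy as np

# Illustrative sketch only: discrete structure/feature perturbations of an
# attributed graph, applied before training (poisoning setting).
rng = np.random.default_rng(0)

n_nodes, n_feats = 6, 4
A = (rng.random((n_nodes, n_nodes)) < 0.3).astype(int)  # binary adjacency matrix
A = np.triu(A, 1); A = A + A.T                           # symmetric, no self-loops
X = (rng.random((n_nodes, n_feats)) < 0.5).astype(int)   # binary node features

def flip_edge(A, u, v):
    """Structure perturbation: insert the edge (u, v) if absent, remove it if present."""
    A = A.copy()
    A[u, v] = A[v, u] = 1 - A[u, v]
    return A

def flip_feature(X, u, f):
    """Feature perturbation: toggle feature f of node u."""
    X = X.copy()
    X[u, f] = 1 - X[u, f]
    return X

# A small budget of flips aimed at a target node; an actual attack would choose
# the flips that most damage the target's classification while keeping degree and
# feature statistics plausible, so the perturbation stays unnoticeable.
target, budget = 0, 2
A_pert = flip_edge(A, target, 3)
X_pert = flip_feature(X, target, 1)

# In the poisoning setting, the graph-convolutional classifier is then trained on
# (A_pert, X_pert) instead of (A, X), so the corruption shapes the learned model.
print("edges changed:", int(np.abs(A_pert - A).sum() // 2))
print("features changed:", int(np.abs(X_pert - X).sum()))
```

The choice of flipping binary entries reflects the discrete domain mentioned above; selecting which entries to flip efficiently is what the incremental computations in Nettack address.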