Randomized smoothing is one of the most promising frameworks for certifying the adversarial robustness of machine learning models, including Graph Neural Networks (GNNs). Yet, existing randomized smoothing certificates for GNNs are overly pessimistic since they treat the model as a black box, ignoring the underlying architecture. To remedy this, we propose novel gray-box certificates that exploit the message-passing principle of GNNs: We randomly intercept messages and carefully analyze the probability that messages from adversarially controlled nodes reach their target nodes. Compared to existing certificates, we certify robustness to much stronger adversaries that control entire nodes in the graph and can arbitrarily manipulate node features. Our certificates provide stronger guarantees for attacks at larger distances, as messages from farther-away nodes are more likely to get intercepted. We demonstrate the effectiveness of our method on various models and datasets. Since our gray-box certificates consider the underlying graph structure, we can significantly improve certifiable robustness by applying graph sparsification.
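The intuition that messages from farther-away nodes are more likely to get intercepted can be illustrated with a small Monte Carlo sketch. It is not the certificate itself. It assumes, purely for illustration, that interception means deleting each undirected edge independently with probability `p_delete`, that the same sampled graph is used in every layer, and that a message can travel at most `num_layers` hops. The function names, the toy path graph, and all parameter values are hypothetical.

```python
import random
from collections import deque


def reaches_target(edges, source, target, num_layers, p_delete, rng):
    """One random trial: drop each edge with probability p_delete, then
    check whether `source` lies within `num_layers` hops of `target`."""
    # Assumption: interception is modeled as independent edge deletion.
    kept = [e for e in edges if rng.random() >= p_delete]
    adj = {}
    for u, v in kept:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    # Breadth-first search limited to num_layers hops.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        if dist[u] == num_layers:
            continue  # the message cannot travel any further
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return False


def estimate_reach_probability(edges, source, target, num_layers,
                               p_delete, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that a message sent by
    `source` still reaches `target` after random interception."""
    rng = random.Random(seed)
    hits = sum(
        reaches_target(edges, source, target, num_layers, p_delete, rng)
        for _ in range(trials)
    )
    return hits / trials


# Toy path graph 0 - 1 - 2 - 3, with node 0 as the target node.
path_edges = [(0, 1), (1, 2), (2, 3)]
near = estimate_reach_probability(path_edges, source=1, target=0,
                                  num_layers=3, p_delete=0.5)
far = estimate_reach_probability(path_edges, source=3, target=0,
                                 num_layers=3, p_delete=0.5)
print(f"distance 1: {near:.3f}, distance 3: {far:.3f}")
```

On a single path of length d, this simplified model gives a reach probability of (1 - p_delete)^d, so the estimate for the distance-3 node should be close to 0.125 and the estimate for the direct neighbor close to 0.5. Removing edges from the graph before smoothing can only lower these probabilities in this toy model, which is consistent with the abstract's remark about graph sparsification.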