In tasks like node classification, image segmentation, and named-entity recognition, we have a classifier that simultaneously outputs multiple predictions (a vector of labels) based on a single input, i.e., a single graph, image, or document, respectively. Existing adversarial robustness certificates consider each prediction independently and are thus overly pessimistic for such tasks. They implicitly assume that an adversary can use different perturbed inputs to attack different predictions, ignoring the fact that we have a single shared input. We propose the first collective robustness certificate, which computes the number of predictions that are simultaneously guaranteed to remain stable under perturbation, i.e., cannot be attacked. We focus on Graph Neural Networks and leverage their locality property (perturbations only affect the predictions in a close neighborhood) to fuse multiple single-node certificates into a drastically stronger collective certificate. For example, on the Citeseer dataset, our collective certificate for node classification increases the average number of certifiable feature perturbations from $7$ to $351$.
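To make the gap between independent and collective certification concrete, the following toy sketch may help. It is not the certificate proposed in this work. It is a brute-force illustration under assumptions introduced only for this example: each prediction has a hypothetical receptive field (the set of nodes whose perturbation can affect it) and a hypothetical certified radius (the number of perturbations inside that field it is guaranteed to tolerate), and the adversary has one global budget. The names `naive_certified` and `collective_certified` are illustrative.

```python
from itertools import product


def naive_certified(radii, budget):
    """Independent view: every prediction is assumed to face the full budget."""
    return sum(1 for r in radii if r >= budget)


def collective_certified(receptive_fields, radii, budget, num_nodes):
    """Collective view: one shared perturbation must attack all predictions at once.

    Brute-forces every allocation of perturbations to nodes (toy sizes only)
    and returns the number of predictions that stay stable in the worst case.
    """
    worst_attacked = 0
    for alloc in product(range(budget + 1), repeat=num_nodes):
        if sum(alloc) > budget:
            continue  # allocation exceeds the adversary's global budget
        # A prediction may change only if the perturbations inside its
        # receptive field exceed its (assumed) certified radius.
        attacked = sum(
            1
            for field, r in zip(receptive_fields, radii)
            if sum(alloc[m] for m in field) > r
        )
        worst_attacked = max(worst_attacked, attacked)
    return len(radii) - worst_attacked


# Hypothetical example: four predictions in two disjoint neighborhoods.
receptive_fields = [{0, 1}, {0, 1}, {2, 3}, {2, 3}]
radii = [2, 2, 2, 2]
budget = 3

print(naive_certified(radii, budget))  # 0
print(collective_certified(receptive_fields, radii, budget, num_nodes=4))  # 2
```

In this toy setting the independent view certifies nothing, because the budget exceeds every radius. The collective view certifies two predictions, because a single perturbed input cannot spend three perturbations in both neighborhoods at once. The enumeration is exponential in the number of nodes and only serves to illustrate the idea.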