Graph Neural Networks (GNNs) have established themselves as state-of-the-art models for many machine learning applications, such as the analysis of social networks, protein interactions, and molecules. Many of these datasets contain privacy-sensitive data. Machine learning with differential privacy is a promising technique for deriving insights from sensitive data while offering formal guarantees of privacy protection. However, differentially private training of GNNs has so far remained under-explored due to the challenges posed by the intrinsic structural connectivity of graphs. In this work, we introduce differential privacy for graph-level classification, one of the key applications of machine learning on graphs. Our method is applicable to deep learning on multi-graph datasets and relies on differentially private stochastic gradient descent (DP-SGD). We show results on a variety of synthetic and public datasets and evaluate the impact of different GNN architectures and training hyperparameters on model performance for differentially private graph classification. Finally, we apply explainability techniques to assess whether similar representations are learned in the private and non-private settings, and we establish robust baselines for future work in this area.
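The core mechanism referenced above is DP-SGD, which clips each training example's gradient to a fixed L2 norm and adds calibrated Gaussian noise before the optimizer step; in graph-level classification, one training example is an entire graph, so clipping operates per graph. The following is a minimal sketch of that idea in PyTorch, not the authors' implementation: the TinyGNN model, the (A, X, y) batch format, and all hyperparameters are illustrative assumptions, and privacy accounting (tracking ε, δ over training) is omitted.

```python
# Minimal DP-SGD sketch for graph-level classification.
# Assumption: each training example is one graph given as a dense
# adjacency matrix A [n, n], node features X [n, d], and a label y.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyGNN(nn.Module):
    """One round of mean-aggregation message passing + global mean pooling."""
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, n_classes)

    def forward(self, A, X):
        # Normalize the adjacency so each node averages over its neighbours.
        deg = A.sum(dim=1, keepdim=True).clamp(min=1)
        H = F.relu(self.lin1((A / deg) @ X))
        return self.lin2(H.mean(dim=0))  # graph-level readout -> class logits

def dp_sgd_step(model, optimizer, batch, clip_norm=1.0, noise_mult=1.0):
    """One DP-SGD step: per-graph gradient clipping + Gaussian noise.

    `batch` is a list of (A, X, y) tuples; clip_norm and noise_mult are
    illustrative values, not tuned settings from the paper.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for A, X, y in batch:  # microbatch of size 1: one graph at a time
        loss = F.cross_entropy(model(A, X).unsqueeze(0), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        # Clip this graph's gradient to L2 norm <= clip_norm.
        total = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total + 1e-6)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s += g * scale
    for p, s in zip(params, summed):
        # Gaussian noise scaled to the clipping bound, then averaged.
        noise = torch.randn_like(s) * noise_mult * clip_norm
        p.grad = (s + noise) / len(batch)
    optimizer.step()
```

Because the noise scale is tied to the clipping bound, each graph's influence on the update is bounded, which is what makes the per-step privacy guarantee possible; a production implementation would additionally use Poisson subsampling and a privacy accountant.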