Graph Neural Networks (GNNs) have become a popular tool for learning on graphs, but their widespread use raises privacy concerns, as graph data can contain personal or sensitive information. Differentially private GNN models have recently been proposed to preserve privacy while still allowing effective learning over graph-structured datasets. However, achieving an ideal balance between accuracy and privacy in GNNs remains challenging due to the intrinsic structural connectivity of graphs. In this paper, we propose a new differentially private GNN called ProGAP that uses a progressive training scheme to improve the accuracy-privacy trade-off. Combined with the aggregation perturbation technique to ensure differential privacy, ProGAP splits a GNN into a sequence of overlapping submodels that are trained progressively, expanding from the first submodel to the complete model. Specifically, each submodel is trained over the privately aggregated node embeddings learned and cached by the previous submodels, leading to increased expressive power compared to previous approaches while limiting the incurred privacy costs. We formally prove that ProGAP ensures edge-level and node-level privacy guarantees for both the training and inference stages, and we evaluate its performance on benchmark graph datasets. Experimental results demonstrate that ProGAP can achieve up to 5-10% higher accuracy than existing state-of-the-art differentially private GNNs.
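The progressive scheme described above can be illustrated with a minimal sketch: each stage applies a submodel to the previous stage's cached output, then performs a single noisy neighborhood aggregation whose result is cached for the next stage, so the privacy cost of each aggregation is paid only once. This is a toy NumPy illustration, not the paper's implementation; the graph, the `perturbed_aggregate` helper, the noise scale, and the use of random weights in place of trained submodels are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturbed_aggregate(X, A, noise_std):
    """Aggregation perturbation (sketch): sum neighbor embeddings and add
    Gaussian noise. Rows are normalized first so each node's contribution
    to the sum is bounded, keeping the sensitivity of the aggregation fixed."""
    X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    agg = A @ X  # sum over each node's neighbors
    return agg + rng.normal(0.0, noise_std, size=agg.shape)

# Toy graph: 4 nodes on a ring, 3-dimensional features (illustrative only).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))

cache = [X]  # stage-0 input: raw node features
for stage in range(2):  # two progressive stages, i.e. two submodels
    W = rng.normal(size=(cache[-1].shape[1], 3))  # stand-in for a trained submodel
    H = np.tanh(cache[-1] @ W)                    # submodel forward pass
    # Privately aggregate ONCE and cache the noisy result; later stages
    # reuse this cache, so no additional privacy budget is spent on it.
    cache.append(perturbed_aggregate(H, A, noise_std=0.1))

print(cache[-1].shape)  # (4, 3)
```

Because the noisy aggregations are cached, expanding the model from one submodel to the next reuses earlier outputs instead of re-querying the graph, which is how the scheme limits the accumulated privacy cost as depth grows.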