Deep learning models are known to put the privacy of their training data at risk, which poses challenges for their safe and ethical release to the public. Differentially private stochastic gradient descent is the de facto standard for training neural networks without leaking sensitive information about the training data. However, applying it to models for graph-structured data poses a novel challenge: unlike with i.i.d. data, sensitive information about a node in a graph can leak not only through its own gradient, but also through the gradients of all nodes within a larger neighborhood. In practice, this limits privacy-preserving deep learning on graphs to very shallow graph neural networks. We propose to solve this issue by training graph neural networks on disjoint subgraphs of a given training graph. We develop three random-walk-based methods for generating such disjoint subgraphs and perform a careful analysis of the data-generating distributions to provide strong privacy guarantees. Through extensive experiments, we show that our method greatly outperforms the state-of-the-art baseline on three large graphs, and matches or outperforms it on four smaller ones.
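To make the idea of random-walk-based disjoint subgraph generation concrete, here is a minimal sketch, not the paper's actual algorithm: it grows truncated random walks that are forbidden from entering nodes already claimed by an earlier walk, so the resulting subgraphs are node-disjoint. The use of `networkx`, the `walk_length` parameter, and the random graph used for the sanity check are all illustrative assumptions.

```python
# Illustrative sketch (assumed, not the paper's exact method): partition a graph
# into node-disjoint subgraphs by growing random walks that never revisit a node
# already assigned to a previous subgraph.
import random
import networkx as nx


def disjoint_random_walk_subgraphs(graph, walk_length=8, seed=0):
    """Return a list of node-disjoint subgraphs induced by truncated random walks."""
    rng = random.Random(seed)
    unassigned = set(graph.nodes())
    subgraphs = []
    roots = list(graph.nodes())
    rng.shuffle(roots)
    for root in roots:
        if root not in unassigned:
            continue  # node already belongs to an earlier subgraph
        walk = [root]
        unassigned.discard(root)
        current = root
        for _ in range(walk_length - 1):
            # Restrict the walk to nodes not yet claimed by another subgraph,
            # which is what keeps the generated subgraphs disjoint.
            candidates = [v for v in graph.neighbors(current) if v in unassigned]
            if not candidates:
                break
            current = rng.choice(candidates)
            unassigned.discard(current)
            walk.append(current)
        subgraphs.append(graph.subgraph(walk).copy())
    return subgraphs


if __name__ == "__main__":
    g = nx.erdos_renyi_graph(200, 0.05, seed=1)  # hypothetical example graph
    parts = disjoint_random_walk_subgraphs(g, walk_length=8)
    # Sanity check: every node belongs to exactly one subgraph.
    assert sum(sg.number_of_nodes() for sg in parts) == g.number_of_nodes()
    print(f"{len(parts)} disjoint subgraphs")
```

Because each node appears in exactly one subgraph, a node's information can only influence the gradients computed on that single subgraph, which is the property the abstract relies on to bound privacy leakage under differentially private training.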