Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

Although machine learning models trained on massive data have led to breakthroughs in several areas, their deployment in privacy-sensitive domains remains limited due to restricted access to data. Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead. We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy. DP-Sinkhorn minimizes the Sinkhorn divergence, a computationally efficient approximation to the exact optimal transport distance, between the model and data in a differentially private manner and uses a novel technique for controlling the bias-variance trade-off of gradient estimates. Unlike existing approaches for training differentially private generative models, which are mostly based on generative adversarial networks, we do not rely on adversarial objectives, which are notoriously difficult to optimize, especially in the presence of noise imposed by privacy constraints. Hence, DP-Sinkhorn is easy to train and deploy. Experimentally, we improve upon the state-of-the-art on multiple image modeling benchmarks and show differentially private synthesis of informative RGB images. Project page: https://nv-tlabs.github.io/DP-Sinkhorn.
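For readers unfamiliar with the Sinkhorn divergence named above, the following is a minimal, non-private sketch of the standard debiased form, S(x, y) = OT_eps(x, y) - 0.5 OT_eps(x, x) - 0.5 OT_eps(y, y), computed with log-domain Sinkhorn iterations. It is an illustration only: the squared Euclidean cost, the uniform sample weights, the values of `eps` and `n_iters`, and all function names are assumptions made here, and the differential privacy mechanism and the bias-variance technique of DP-Sinkhorn are not reproduced. The exact loss used by the method may differ from this form.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def entropic_ot(x, y, eps=0.5, n_iters=200):
    """Entropy-regularized OT value between uniform empirical measures.

    x: (n, d) array, y: (m, d) array. Squared Euclidean cost is an
    assumption of this sketch, not something stated in the abstract.
    """
    n, m = len(x), len(y)
    cost = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)  # (n, m)
    log_a = np.full(n, -np.log(n))  # log of uniform weights on x
    log_b = np.full(m, -np.log(m))  # log of uniform weights on y
    f = np.zeros(n)  # dual potential on x
    g = np.zeros(m)  # dual potential on y
    for _ in range(n_iters):
        # Alternating Sinkhorn updates of the dual potentials (log domain).
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    # Dual objective; it approaches the entropic OT value as the iterations converge.
    return np.sum(f) / n + np.sum(g) / m


def sinkhorn_divergence(x, y, eps=0.5, n_iters=200):
    # Debiased divergence: subtract the two self-transport terms.
    return (
        entropic_ot(x, y, eps, n_iters)
        - 0.5 * entropic_ot(x, x, eps, n_iters)
        - 0.5 * entropic_ot(y, y, eps, n_iters)
    )
```

In a generative setting, `x` would be a batch of model samples and `y` a batch of data, with the divergence serving as the training loss. A larger `eps` makes the iterations converge faster but blurs the transport plan, which is the sense in which the divergence trades accuracy against computation.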
