In this paper, we develop a general framework for designing differentially private expectation-maximization (EM) algorithms in high-dimensional latent variable models, based on noisy iterative hard-thresholding. We derive statistical guarantees for the proposed framework and apply it to three specific models: Gaussian mixture, mixture of regressions, and regression with missing covariates. For each model, we establish a near-optimal rate of convergence under differential privacy constraints and show that the proposed algorithm is minimax rate optimal up to logarithmic factors. The technical tools developed for the high-dimensional setting are then extended to classical low-dimensional latent variable models, where we propose a near rate-optimal EM algorithm with differential privacy guarantees. Simulation studies and real data analysis are conducted to support our results.
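The core primitive named above, noisy iterative hard-thresholding, can be illustrated with a minimal sketch: perturb the update vector with Gaussian noise (as in the Gaussian mechanism for differential privacy) and then retain only the s largest coordinates in magnitude. The function name, the sparsity level s, and the noise scale sigma below are illustrative assumptions, not the paper's exact algorithm or calibration.

```python
import numpy as np

def noisy_hard_threshold(v, s, sigma, rng=None):
    """One illustrative noisy hard-thresholding step (assumed form).

    Adds Gaussian noise of scale sigma to the update vector v,
    then keeps only the s largest-magnitude entries, zeroing the
    rest to enforce s-sparsity of the iterate.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = v + rng.normal(scale=sigma, size=v.shape)
    # indices of the s largest-magnitude coordinates
    keep = np.argsort(np.abs(noisy))[-s:]
    out = np.zeros_like(noisy)
    out[keep] = noisy[keep]
    return out
```

In a DP-EM iteration, such a step would follow each (clipped) M-step update, with sigma calibrated to the sensitivity of the update and the privacy budget.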