Private regression has received attention from both the database and security communities. Recent work by Fredrikson et al. (USENIX Security 2014) analyzed the functional mechanism (Zhang et al., VLDB 2012) for training linear regression models over medical data. Unfortunately, they found that model accuracy is unacceptable even under a weak differential privacy guarantee of $\varepsilon = 5$. We address this issue by presenting an explicit connection between differential privacy and stable learning theory, through which a substantially better privacy/utility tradeoff can be obtained. Perhaps more importantly, our theory reveals that the most basic mechanism in differential privacy, output perturbation, can be used to obtain a better tradeoff for all convex-Lipschitz-bounded learning tasks. Since output perturbation is simple to implement, our approach is potentially widely applicable in practice. We go on to apply it to the same medical data used by Fredrikson et al. Encouragingly, we achieve accurate models even for $\varepsilon = 0.1$. In the last part of this paper, we study the impact of our improved differentially private mechanisms on model inversion attacks, a class of privacy attacks introduced by Fredrikson et al. We observe that the improved tradeoff makes the resulting differentially private model more susceptible to inversion attacks. We analyze this phenomenon formally.
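To make the output perturbation mechanism mentioned above concrete, the following is a minimal sketch for regularized linear regression, not the paper's exact calibration: it solves a ridge-regression problem and releases the minimizer plus noise whose scale follows the standard $L_2$-sensitivity bound $2L/(n\lambda)$ for $\lambda$-strongly-convex ERM with $L$-Lipschitz losses (in the style of Chaudhuri et al.). The parameters `lam` and `lipschitz`, and the data-generation in the usage example, are illustrative assumptions.

```python
import numpy as np

def output_perturbation_ridge(X, y, epsilon, lam=1.0, lipschitz=1.0):
    """Release a ridge-regression model under epsilon-differential privacy
    via output perturbation (train non-privately, then add noise to weights).

    Assumes each per-example loss is `lipschitz`-Lipschitz in the weights
    and the regularized objective is `lam`-strongly convex; the noise scale
    below uses the generic sensitivity bound 2*L/(n*lam), which may differ
    from the calibration used in the paper.
    """
    n, d = X.shape

    # Non-private minimizer of (1/n)*sum (x_i^T w - y_i)^2 + lam*||w||^2.
    w = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)

    # L2 sensitivity of the minimizer to changing one record.
    sensitivity = 2.0 * lipschitz / (n * lam)

    # High-dimensional Laplace noise: density proportional to
    # exp(-epsilon * ||b|| / sensitivity); the norm is Gamma(d, sens/eps)
    # and the direction is uniform on the sphere.
    norm = np.random.gamma(shape=d, scale=sensitivity / epsilon)
    direction = np.random.normal(size=d)
    direction /= np.linalg.norm(direction)

    return w + norm * direction

# Illustrative usage on synthetic data (not the medical dataset in the paper).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=1000)
    w_private = output_perturbation_ridge(X, y, epsilon=0.1)
    print(w_private)
```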