The Randomized Response (RR) algorithm is a classical technique for improving robustness in survey aggregation, and it has been widely adopted in applications with differential privacy guarantees. We propose a novel algorithm, Randomized Response with Prior (RRWithPrior), which can provide more accurate results while maintaining the same level of privacy guaranteed by RR. We then apply RRWithPrior to learn neural networks with label differential privacy (LabelDP) and show that, when only the labels need to be protected, model performance can be significantly improved over the previous state-of-the-art private baselines. Moreover, we study different ways to obtain priors which, when used with RRWithPrior, can further improve model performance and reduce the accuracy gap between private and non-private models. We complement the empirical results with a theoretical analysis showing that learning with LabelDP is provably easier than learning while protecting both the inputs and the labels.
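For readers unfamiliar with the baseline, the following is a minimal sketch of the classical K-ary randomized response mechanism applied to a single class label. It illustrates only the standard RR baseline, not RRWithPrior, whose details are not given in this abstract. The function name `randomized_response` and its parameters (`label`, `num_classes`, `epsilon`, `rng`) are hypothetical names chosen for illustration.

```python
import math
import random


def randomized_response(label, num_classes, epsilon, rng=random):
    """Classical K-ary randomized response (illustrative sketch).

    Keeps the true label with probability e^eps / (e^eps + K - 1);
    otherwise returns one of the other K - 1 labels uniformly at random.
    All names here are assumptions made for this example.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    if not 0 <= label < num_classes:
        raise ValueError("label must lie in [0, num_classes)")

    exp_eps = math.exp(epsilon)
    keep_prob = exp_eps / (exp_eps + num_classes - 1)

    if rng.random() < keep_prob:
        return label

    # Draw uniformly from the K - 1 labels other than the true one:
    # sample an index in [0, K - 2] and skip over the true label.
    other = rng.randrange(num_classes - 1)
    return other if other < label else other + 1
```

With this mechanism, the probability of reporting any given output differs by a factor of at most e^epsilon between two different true labels, which is the epsilon-differential-privacy condition on the label. A smaller `epsilon` gives stronger privacy but noisier labels, which is the accuracy cost that a prior-aware variant aims to reduce.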