We study the task of training regression models with the guarantee of label differential privacy (DP). Based on a global prior distribution on label values, which could be obtained privately, we derive a label DP randomization mechanism that is optimal under a given regression loss function. We prove that the optimal mechanism takes the form of a ``randomized response on bins'', and propose an efficient algorithm for finding the optimal bin values. We carry out a thorough experimental evaluation on several datasets demonstrating the efficacy of our algorithm.
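To make the phrase "randomized response on bins" concrete, the following is a minimal sketch of standard k-ary randomized response applied to binned labels. It is an illustration under stated assumptions, not the paper's algorithm: the function name `rr_on_bins`, the nearest-value rule for assigning a label to a bin, and the example bin values are all hypothetical, and the bin values here are chosen by hand rather than optimized for a loss function and a prior.

```python
import math
import random


def rr_on_bins(label, bin_values, epsilon, rng=random):
    """Randomize one label with k-ary randomized response over bin values.

    Assumptions (not taken from the source): the label is first mapped to
    its nearest bin value, and bin_values is a non-empty list of distinct
    numbers supplied by the caller.
    """
    k = len(bin_values)
    # Map the raw label to the index of the nearest bin value.
    true_idx = min(range(k), key=lambda i: abs(bin_values[i] - label))
    if k == 1:
        # With a single bin the output carries no information about the label.
        return bin_values[0]

    # Keep the true bin with probability e^eps / (e^eps + k - 1);
    # otherwise report one of the other k - 1 bins uniformly at random.
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return bin_values[true_idx]

    other = rng.randrange(k - 1)
    if other >= true_idx:
        other += 1  # skip over the true bin
    return bin_values[other]


# Hypothetical bin values; the randomized label is always one of them.
bins = [0.0, 3.0, 10.0]
noisy_label = rr_on_bins(2.9, bins, epsilon=1.0)
```

Because any two input labels produce each output bin with probability ratio at most e^ε, this mechanism satisfies ε-label DP for any fixed choice of bins; which bin values minimize the expected regression loss is the question the paper's algorithm addresses.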