A Comprehensive Survey on Local Differential Privacy Toward Data Statistics and Analysis in Crowdsensing

Collecting and analyzing the massive data generated by smart devices has become increasingly pervasive in crowdsensing, and such data are the building blocks of data-driven decision-making. However, extensive statistics and analysis over such data seriously threaten the privacy of participating users. Local differential privacy (LDP) has been proposed as a prevalent privacy model with a distributed architecture that provides strong privacy guarantees for each user while data are collected and analyzed. Under LDP, each user's data is first perturbed locally on the client side and only then sent to the server side, thereby protecting the data from privacy leaks on both the client side and the server side. This survey presents a comprehensive and systematic overview of LDP with respect to privacy models, research tasks, enabling mechanisms, and applications. Specifically, we first provide a theoretical summary of LDP, including the LDP model, its variants, and the basic framework of LDP algorithms. Then, we investigate and compare the diverse LDP mechanisms for data statistics and analysis tasks from the perspectives of frequency estimation, mean estimation, and machine learning. We also summarize practical LDP-based application scenarios. Finally, we outline several future research directions under LDP.
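
As an illustration of the client-side perturbation and server-side aggregation described above, the following is a minimal sketch of binary randomized response, a classic mechanism that satisfies epsilon-LDP for a single bit. It is not taken from the survey itself: the function names `perturb_bit` and `estimate_frequency`, the privacy budget of 1.0, and the synthetic population (30% of users hold a 1) are all assumptions chosen for the example.

```python
import math
import random
from typing import Sequence


def perturb_bit(bit: int, epsilon: float, rng: random.Random) -> int:
    """Client side: keep the true bit with probability e^eps / (e^eps + 1),
    otherwise flip it. Only the perturbed bit leaves the device."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_frequency(reports: Sequence[int], epsilon: float) -> float:
    """Server side: unbiased estimate of the fraction of users whose true
    bit is 1, computed from perturbed reports only.

    If f is the true fraction and p the probability of telling the truth,
    the expected observed fraction is f*p + (1 - f)*(1 - p); solving for f
    gives the correction below."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical population: 3,000 users hold a 1 and 7,000 hold a 0.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reports = [perturb_bit(b, 1.0, rng) for b in true_bits]
estimate = estimate_frequency(reports, 1.0)
print(f"estimated fraction of ones: {estimate:.3f}")
```

The estimate is noisy for any single run and tightens as the number of users grows. A smaller epsilon gives each user stronger deniability at the cost of a higher-variance estimate.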
