Local differential privacy (LDP), in which each user perturbs her data locally and sends only the noisy version to the aggregator, is a popular mechanism for privacy-preserving data collection. Under LDP, the data collector can obtain accurate statistics without access to the original data, which guarantees privacy. A primary drawback of LDP, however, is its poor utility in high-dimensional space. Although various LDP schemes have been proposed to reduce perturbation, they all share the same naive aggregation mechanism on the collector side. In this paper, we first propose an analytical framework that measures the utility of LDP mechanisms in high-dimensional space in a general way, and that can benchmark existing and future LDP mechanisms without conducting any experiments. The framework further reveals that naive aggregation is sub-optimal in high-dimensional space and leaves much room for improvement. Motivated by this, we present HDR4ME, a re-calibration protocol for high-dimensional mean estimation that improves the utility of existing LDP mechanisms without modifying them. Both theoretical analysis and extensive experiments confirm the generality and effectiveness of our framework and protocol.
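To make the utility problem concrete, the following is a minimal sketch of LDP mean estimation with naive aggregation. It is not HDR4ME and is not taken from the paper. It assumes a Laplace mechanism with the privacy budget split evenly across dimensions, values in [-1, 1], and hypothetical helper names (`perturb`, `naive_mean`, `run`). Under these assumptions the per-coordinate noise scale is 2d/ε, so the error of the averaged reports grows quickly with the dimension d.

```python
import random


def perturb(vec, epsilon, rng):
    """Perturb one user's vector locally (assumed Laplace mechanism).

    Each coordinate lies in [-1, 1] (sensitivity 2) and receives an equal
    share epsilon / d of the budget, so the Laplace scale is 2 * d / epsilon.
    """
    d = len(vec)
    scale = 2.0 * d / epsilon
    # The difference of two Exp(1) samples follows a Laplace(0, 1) distribution.
    return [x + scale * (rng.expovariate(1.0) - rng.expovariate(1.0)) for x in vec]


def naive_mean(reports):
    """Naive aggregation: average the reports coordinate by coordinate."""
    n = len(reports)
    d = len(reports[0])
    return [sum(r[j] for r in reports) / n for j in range(d)]


def mean_squared_error(estimate, truth):
    """Average squared error over all coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


def run(d, n=2000, epsilon=1.0, seed=0):
    """Simulate n users with d-dimensional data and return the MSE of the
    naive estimate against the true (unperturbed) mean."""
    rng = random.Random(seed)
    users = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]
    truth = naive_mean(users)
    reports = [perturb(u, epsilon, rng) for u in users]
    return mean_squared_error(naive_mean(reports), truth)


if __name__ == "__main__":
    for d in (1, 10, 50):
        print(f"d={d:3d}  MSE of naive mean = {run(d):.4f}")
```

With the number of users and ε held fixed, the expected per-coordinate variance of this naive estimate is 8d²/(ε²n), which illustrates why aggregation on the collector side becomes the bottleneck as d grows.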