Locally Differentially Private (LDP) reports are commonly used for collecting statistics and for machine learning in the federated setting. In many cases the best known LDP algorithms require sending prohibitively large messages from the client device to the server (for example, when constructing histograms over a large domain or learning a high-dimensional model). This has led to significant efforts to reduce the communication cost of LDP algorithms. At the same time, LDP reports are known to carry relatively little information about the user's data due to randomization. Several schemes are known that exploit this fact to design low-communication versions of LDP algorithms, but all of them do so at the expense of a significant loss in utility. Here we demonstrate a general approach that, under standard cryptographic assumptions, compresses every efficient LDP algorithm with negligible loss in privacy and utility guarantees. The practical implication of our result is that in typical applications the message can be compressed to the size of the server's pseudo-random generator seed. More generally, we relate the properties of an LDP randomizer to the power of a pseudo-random generator that suffices for compressing it. From this general approach we derive low-communication algorithms for the problems of frequency estimation and high-dimensional mean estimation. Our algorithms are simpler and more accurate than existing low-communication LDP algorithms for these well-studied problems.
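The seed-compression idea described above can be illustrated with a toy sketch (this is only a minimal illustration, not the paper's actual construction): instead of sending its full randomized report, the client rejection-samples a short seed whose public expansion is distributed exactly as the output of its LDP randomizer, and sends only that seed. Here the randomizer is binary randomized response, `expand` stands in for the server's pseudo-random generator, and all names and parameters are assumptions made for the example.

```python
import hashlib
import random

def expand(seed: int) -> int:
    # Stand-in for the server's public PRG: deterministically expands a
    # short seed to a reference sample (here a single bit; in a real
    # application, a full-length report over a large domain).
    return hashlib.sha256(seed.to_bytes(8, "big")).digest()[0] & 1

def compress_report(x: int, p: float = 0.75, rng: random.Random = random) -> int:
    """Client side: rejection-sample a short seed whose expansion is
    distributed as binary randomized response R(x), which outputs x
    with probability p and 1-x with probability 1-p."""
    # The expansion of a fresh seed is (close to) uniform, so the
    # reference density is Q(y) = 1/2; M bounds the ratio P[R(x)=y]/Q(y).
    M = max(p, 1 - p) / 0.5
    while True:
        s = rng.getrandbits(16)          # candidate short seed
        y = expand(s)
        p_target = p if y == x else 1 - p
        if rng.random() < p_target / (M * 0.5):
            return s                     # send only the 16-bit seed

# Server side: decompress the report by re-expanding the seed.
def decompress_report(seed: int) -> int:
    return expand(seed)
```

By standard rejection sampling, `expand(compress_report(x))` is distributed exactly as randomized response on `x`, yet only the short seed crosses the wire; for a single bit this saves nothing, but the same pattern applies when the report is a long vector while the seed stays small.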