Two major challenges in distributed learning and estimation are 1) preserving the privacy of the local samples; and 2) communicating them efficiently to a central server, while achieving high accuracy for the end-to-end task. While there has been significant interest in addressing each of these challenges separately in the recent literature, treatments that simultaneously address both challenges are still largely missing. In this paper, we develop novel encoding and decoding mechanisms that simultaneously achieve optimal privacy and communication efficiency in various canonical settings. In particular, we consider the problems of mean estimation and frequency estimation under $\varepsilon$-local differential privacy and $b$-bit communication constraints. For mean estimation, we propose a scheme based on Kashin's representation and random sampling, with order-optimal estimation error under both constraints. For frequency estimation, we present a mechanism that leverages the recursive structure of Walsh-Hadamard matrices and achieves order-optimal estimation error for all privacy levels and communication budgets. As a by-product, we also construct a distribution estimation mechanism that is rate-optimal for all privacy regimes and communication constraints, extending recent work that is limited to $b=1$ and $\varepsilon=O(1)$. Our results demonstrate that intelligent encoding under joint privacy and communication constraints can yield a performance that matches the optimal accuracy achievable under either constraint alone.
翻译:分布式学习和估算方面的两大挑战是:(1) 维护当地样本的隐私;(2) 将样本有效地传递给中央服务器,同时实现端到端任务的高度准确性;虽然最近文献对分别应对其中每一项挑战都有很大兴趣,但同时应对这两项挑战的处理方法仍然基本缺乏。在本文中,我们开发了新的编码和解码机制,同时在各种气候环境中实现最佳隐私和通信效率。特别是,我们考虑在$\varepsilon-当地差异隐私和美元比特通信限制下的平均估计和频率估计问题。关于平均估算,我们提议基于Kashin的表述和随机抽样,在两种限制下都存在定点最佳估计错误。关于频率估算,我们提出了一个机制,利用沃尔什-哈马德基体的循环结构,实现所有隐私水平和通信预算的定点最佳估计错误。作为副产品,我们还建立了一种对所有隐私制度和通信限制都最优化的分布估计机制。关于平均估算机制,将最近的工作扩大到以Kashin的准确性($=1美元)和在最佳的通信限制下,能够展示我们最佳的智能安全度(美元)下,在最佳的升级(美元)下,能够显示最佳的精确性要求。