随机分配的隐私放大效应 (Privacy amplification by random allocation)

We consider the privacy amplification properties of a sampling scheme in which a user's data is used in $k$ steps chosen randomly and uniformly from a sequence (or set) of $t$ steps. This sampling scheme has been recently applied in the context of differentially private optimization [Chua et al., 2024a, Choquette-Choo et al., 2024] and is also motivated by communication-efficient high-dimensional private aggregation [Asi et al., 2025]. Existing analyses of this scheme either rely on privacy amplification by shuffling which leads to overly conservative bounds or require Monte Carlo simulations that are computationally prohibitive in most practical scenarios. We give the first theoretical guarantees and numerical estimation algorithms for this sampling scheme. In particular, we demonstrate that the privacy guarantees of random $k$-out-of-$t$ allocation can be upper bounded by the privacy guarantees of the well-studied independent (or Poisson) subsampling in which each step uses the user's data with probability $(1+o(1))k/t$. Further, we provide two additional analysis techniques that lead to numerical improvements in several parameter regimes. Altogether, our bounds give efficiently-computable and nearly tight numerical results for random allocation applied to Gaussian noise addition.

翻译：我们研究了一种采样方案的隐私放大特性，该方案中用户数据在从序列（或集合）的$t$个步骤中随机均匀选择的$k$个步骤中被使用。该采样方案近期已被应用于差分隐私优化场景[Chua等人，2024a；Choquette-Choo等人，2024]，同时也受到通信高效的高维隐私聚合研究的启发[Asi等人，2025]。现有对该方案的分析要么依赖于混洗隐私放大方法（导致过于保守的界），要么需要蒙特卡洛模拟（在多数实际场景中计算代价过高）。我们首次为该采样方案提供了理论保证与数值估计算法。特别地，我们证明了随机$k$取$t$分配的隐私保障上界可由经典研究的独立（或泊松）子采样隐私保障界定，其中每个步骤以概率$(1+o(1))k/t$使用用户数据。此外，我们提出了两种额外的分析技术，在多个参数区间内实现了数值改进。综合而言，我们的界限为应用于高斯噪声添加的随机分配方案提供了高效可计算且近乎紧致的数值结果。