We study robust testing and estimation of discrete distributions in the strong contamination model. We consider both the "centralized setting" and the "distributed setting with information constraints", including communication and local differential privacy (LDP) constraints. Our technique relates the strength of manipulation attacks to the earth mover's distance, using the Hamming distance as the metric between the users' messages (samples). In the centralized setting, we provide optimal error bounds for both learning and testing. Our lower bounds under local information constraints build on recent lower bound methods in distributed inference. In the communication-constrained setting, we develop novel algorithms based on random hashing and an $\ell_1/\ell_1$ isometry.
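To make the contamination model concrete, the following is a minimal sketch of the centralized setting. It is not the paper's algorithm. It assumes the usual reading of strong contamination, where an adversary inspects the $n$ samples and replaces up to a $\gamma$ fraction of them. The helper names (`contaminate`, `hamming`, `tv_distance`, `empirical`) and the parameter values are hypothetical, chosen only for illustration. The sketch checks one elementary fact: if the clean and corrupted sample vectors differ in $h$ positions (Hamming distance), their empirical distributions differ by at most $h/n$ in total variation.

```python
import random
from collections import Counter


def empirical(samples, k):
    """Empirical distribution of samples over the domain {0, ..., k-1}."""
    counts = Counter(samples)
    n = len(samples)
    return [counts.get(i, 0) / n for i in range(k)]


def tv_distance(p, q):
    """Total variation distance between two distributions on the same domain."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def hamming(x, y):
    """Number of positions where the two sample vectors differ."""
    return sum(a != b for a, b in zip(x, y))


def contaminate(samples, gamma, target, rng):
    """Hypothetical adversary: overwrite a gamma fraction of samples with `target`."""
    n = len(samples)
    m = int(gamma * n)  # number of samples the adversary may replace
    corrupted = list(samples)
    for i in rng.sample(range(n), m):
        corrupted[i] = target
    return corrupted


rng = random.Random(0)
k, n, gamma = 5, 1000, 0.1  # illustrative values, not taken from the paper

clean = [rng.randrange(k) for _ in range(n)]  # i.i.d. uniform samples
dirty = contaminate(clean, gamma, target=0, rng=rng)

h = hamming(clean, dirty)
shift = tv_distance(empirical(clean, k), empirical(dirty, k))

# The adversary touches at most int(gamma * n) positions, and each changed
# sample moves 1/n of probability mass, so the TV shift is at most h / n.
assert h <= int(gamma * n)
assert shift <= h / n + 1e-12
```

The adversary here is deliberately simple (it pushes mass onto one symbol). A worst-case adversary may choose replacements adaptively, but the $h/n$ bound on the empirical shift holds for any replacement strategy.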