We study hypothesis testing under communication constraints, where each sample is quantized before being revealed to a statistician. Without communication constraints, it is well known that the sample complexity of simple binary hypothesis testing is characterized by the Hellinger distance between the distributions. We show that the sample complexity of simple binary hypothesis testing under communication constraints is at most a logarithmic factor larger than in the unconstrained setting and this bound is tight. We develop a polynomial-time algorithm that achieves the aforementioned sample complexity. Our framework extends to robust hypothesis testing, where the distributions are corrupted in the total variation distance. Our proofs rely on a new reverse data processing inequality and a reverse Markov inequality, which may be of independent interest. For simple $M$-ary hypothesis testing, the sample complexity in the absence of communication constraints has a logarithmic dependence on $M$. We show that communication constraints can cause an exponential blow-up leading to $\Omega(M)$ sample complexity even for adaptive algorithms.
翻译:我们研究通信限制下的假设测试,每个样本在向统计家披露之前先量化。没有通信限制,众所周知简单的二进制假设测试的样本复杂性的特点是分布之间的距离是Hellinger。我们显示,通信限制下的简单二进制假设测试的样本复杂性最多是一个比未受限制的环境更大的对数系数,而且这一约束是紧凑的。我们开发了一种实现上述样本复杂性的多元时算法。我们的框架延伸到了强力的假设测试,其中分布在完全变异距离中被腐蚀。我们的证据依赖于新的逆向数据处理不平等和反向马尔科夫不平等,这可能具有独立的兴趣。对于简单的美元假设测试来说,在没有通信限制的情况下,抽样复杂性对美元的依赖性比未受限制的环境还要大。我们表明,通信限制可能导致指数性爆炸,导致美元/Omega(M)的样本复杂性,即使是适应性算法也是如此。