We study the following independence testing problem: given access to samples from a distribution $P$ over $\{0,1\}^n$, decide whether $P$ is a product distribution or whether it is $\varepsilon$-far in total variation distance from any product distribution. For arbitrary distributions, this problem requires $\exp(n)$ samples. We show in this work that if $P$ has a sparse structure, then in fact only linearly many samples are required. Specifically, if $P$ is Markov with respect to a Bayesian network whose underlying DAG has in-degree bounded by $d$, then $\tilde{\Theta}(2^{d/2}\cdot n/\varepsilon^2)$ samples are necessary and sufficient for independence testing.
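As a small illustration of the quantities in the problem statement, the following sketch computes, for an explicitly given joint distribution over $\{0,1\}^n$ with tiny $n$, the total variation distance to the product of its own marginals. Because that product is itself a product distribution, this value upper-bounds the distance to the nearest product distribution. The helper names (`product_of_marginals`, `tv_distance`, `leading_sample_term`) are hypothetical and not taken from the paper. The last function only evaluates the leading term $2^{d/2}\cdot n/\varepsilon^2$, ignoring the constants and logarithmic factors hidden by $\tilde{\Theta}$. This is not the testing algorithm itself, which sees only samples.

```python
import numpy as np

def product_of_marginals(p):
    """Return the product of the single-coordinate marginals of p.

    p is an array of shape (2,)*n holding a joint pmf over {0,1}^n.
    """
    p = np.asarray(p, dtype=float)
    n = p.ndim
    q = np.ones_like(p)
    for i in range(n):
        # Marginal of coordinate i: sum out every other axis.
        other_axes = tuple(j for j in range(n) if j != i)
        marginal = p.sum(axis=other_axes)
        # Broadcast the marginal along axis i.
        shape = [1] * n
        shape[i] = 2
        q = q * marginal.reshape(shape)
    return q

def tv_distance(p, q):
    """Total variation distance between two pmfs of the same shape."""
    return 0.5 * np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)).sum()

def leading_sample_term(n, d, eps):
    """Leading term 2^(d/2) * n / eps^2 (constants and log factors ignored)."""
    return 2 ** (d / 2) * n / eps ** 2

# Two perfectly correlated bits: mass 1/2 on 00 and 1/2 on 11.
p_corr = np.array([[0.5, 0.0], [0.0, 0.5]])
# Two independent uniform bits.
p_unif = np.full((2, 2), 0.25)

dist_corr = tv_distance(p_corr, product_of_marginals(p_corr))
dist_unif = tv_distance(p_unif, product_of_marginals(p_unif))
```

Here `dist_unif` is zero because the uniform distribution is already a product distribution, while the correlated pair sits at distance $1/2$ from the product of its (uniform) marginals.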