In this paper, we investigate local permutation tests for testing conditional independence between two random vectors $X$ and $Y$ given $Z$. The local permutation test determines the significance of a test statistic by locally shuffling samples that share similar values of the conditioning variables $Z$, and it forms a natural extension of the usual permutation approach for unconditional independence testing. Despite its simplicity and empirical support, the theoretical underpinnings of the local permutation test remain unclear. Motivated by this gap, this paper aims to establish theoretical foundations of local permutation tests with a particular focus on binning-based statistics. We start by revisiting the hardness of conditional independence testing and provide an upper bound for the power of any valid conditional independence test, which holds when the probability of observing collisions in $Z$ is small. This negative result naturally motivates us to impose additional restrictions on the possible distributions under the null and alternative. To this end, we focus our attention on certain classes of smooth distributions and identify provably tight conditions under which the local permutation method is universally valid, i.e., it is valid when applied to any (binning-based) test statistic. To complement this result on type I error control, we also show that in some cases, a binning-based statistic calibrated via the local permutation method can achieve minimax optimal power. We also introduce a double-binning permutation strategy, which yields a valid test over less smooth null distributions than the typical single-binning method without compromising much power. Finally, we present simulation results to support our theoretical findings.
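To make the local shuffling concrete, the following is a minimal sketch of a single-binning local permutation test. It rests on several assumptions that are not taken from the paper: $Z$ is scalar, the bins are equal-width over the observed range of $Z$, and the test statistic is a simple weighted sum of absolute within-bin covariances chosen only for illustration. The function names (`bin_labels`, `binned_abs_covariance`, `local_permutation_pvalue`) are hypothetical.

```python
import random
from collections import defaultdict


def bin_labels(z, n_bins):
    """Assign each scalar z value to one of n_bins equal-width bins (assumed scheme)."""
    lo, hi = min(z), max(z)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all z are equal
    return [min(int((v - lo) / width), n_bins - 1) for v in z]


def group_indices(labels):
    """Map each bin label to the list of sample indices that fall in it."""
    groups = defaultdict(list)
    for i, b in enumerate(labels):
        groups[b].append(i)
    return groups


def binned_abs_covariance(x, y, labels):
    """Illustrative statistic: sum over bins of (n_b / n) * |within-bin covariance|."""
    total = 0.0
    for idx in group_indices(labels).values():
        mx = sum(x[i] for i in idx) / len(idx)
        my = sum(y[i] for i in idx) / len(idx)
        total += abs(sum((x[i] - mx) * (y[i] - my) for i in idx))
    return total / len(x)


def local_permutation_pvalue(x, y, z, statistic=binned_abs_covariance,
                             n_bins=10, n_perm=199, seed=0):
    """Permutation p-value where y is shuffled only among samples in the same z-bin."""
    rng = random.Random(seed)
    labels = bin_labels(z, n_bins)
    groups = group_indices(labels)
    observed = statistic(x, y, labels)
    exceed = 0
    for _ in range(n_perm):
        y_perm = list(y)
        for idx in groups.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            # Reassign y values within the bin; x and z stay in place.
            for src, dst in zip(shuffled, idx):
                y_perm[dst] = y[src]
        if statistic(x, y_perm, labels) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero.
    return (1 + exceed) / (1 + n_perm)
```

Because `y` is permuted only within bins, the empirical joint behaviour of $(Y, Z)$ at the bin level is preserved while any within-bin link between $X$ and $Y$ is broken. The number of bins is left as a free parameter here; the sketch does not attempt to reproduce the smoothness conditions or the double-binning strategy discussed in the abstract.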