We propose a conditional independence (CI) test based on a new measure, the \emph{spectral generalized covariance measure} (SGCM). The SGCM is constructed by approximating the basis expansion of the squared norm of the conditional cross-covariance operator, using data-dependent bases obtained via spectral decompositions of empirical covariance operators. This construction avoids direct estimation of conditional mean embeddings and reduces the problem to scalar-valued regressions, resulting in robust finite-sample size control. Theoretically, we derive the limiting distribution of the SGCM statistic, establish the validity of a wild bootstrap for inference, and obtain uniform asymptotic size control under doubly robust conditions. As an additional contribution, we show that exponential kernels induced by continuous semimetrics of negative type are characteristic on general Polish spaces -- with extensions to finite tensor products -- thereby providing a foundation for applying our test and other kernel methods to complex objects such as distribution-valued data and curves on metric spaces. Extensive simulations indicate that the SGCM-based CI test attains near-nominal size and exhibits power competitive with or superior to state-of-the-art alternatives across a range of challenging scenarios.
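For orientation, the kernel class referred to above can be written out as follows. This is a sketch in standard notation and not necessarily the notation used in the body of the paper: the symbols $\rho$, $\mathcal{X}$, $\alpha_i$, and the bandwidth $\sigma$ are assumptions introduced here for illustration. A semimetric $\rho$ on $\mathcal{X}$ is said to be of negative type if, for all $n \ge 2$ and all $x_1, \dots, x_n \in \mathcal{X}$,
\[
  \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, \rho(x_i, x_j) \le 0
  \quad \text{whenever } \sum_{i=1}^{n} \alpha_i = 0 ,
\]
and an exponential kernel induced by $\rho$ then takes the form
\[
  k_\sigma(x, y) = \exp\bigl( -\rho(x, y) / \sigma \bigr), \qquad \sigma > 0 .
\]
By Schoenberg's classical result, negative type of $\rho$ implies that $k_\sigma$ is positive definite for every $\sigma > 0$. A familiar instance is the Euclidean distance $\rho(x, y) = \lVert x - y \rVert$ on $\mathbb{R}^d$, which is of negative type and yields the Laplace-type kernel $\exp(-\lVert x - y \rVert / \sigma)$.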