Testing for dependence between two random variables is an important inference problem in statistics, since many statistical procedures rely on the assumption that two samples are independent. To test whether two samples are independent, a test based on the Hilbert--Schmidt Independence Criterion (HSIC) has been proposed, whose null distribution is approximated either by permutation or by a Gamma approximation. In this paper, a new HSIC-based test is proposed. Its asymptotic null and alternative distributions are established, and the proposed test is shown to be root-n consistent. The null distribution of the test statistic is approximated by a three-cumulant matched chi-squared approximation. By choosing a proper reproducing kernel, the proposed test can be applied to many different types of data, including multivariate, high-dimensional, and functional data. Three simulation studies and two real data applications show that, in terms of level accuracy, power, and computational cost, the proposed test outperforms several existing tests for multivariate, high-dimensional, and functional data.
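For orientation, the existing permutation-calibrated HSIC test mentioned above can be sketched as follows. This is a minimal illustration and not the test proposed in this paper: it assumes the standard biased estimator trace(KHLH)/n^2, Gaussian kernels, and a median-heuristic bandwidth, none of which are specified in the abstract. The function names `gaussian_gram` and `hsic_permutation_test` are hypothetical.

```python
import numpy as np


def gaussian_gram(x, bandwidth=None):
    """Gaussian-kernel Gram matrix for the rows of x (assumed kernel choice)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    if bandwidth is None:
        # Median heuristic (an assumption, not taken from the paper).
        med = np.median(d2[np.triu_indices_from(d2, k=1)])
        bandwidth = np.sqrt(0.5 * med) if med > 0 else 1.0
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def hsic_permutation_test(x, y, n_perm=500, seed=0):
    """Return (biased HSIC estimate, permutation p-value)."""
    rng = np.random.default_rng(seed)
    K = gaussian_gram(x)
    L = gaussian_gram(y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    Kc = H @ K @ H
    # trace(K H L H) / n^2, written as an elementwise sum (L is symmetric).
    observed = np.sum(Kc * L) / n ** 2
    null = np.empty(n_perm)
    for b in range(n_perm):
        p = rng.permutation(n)
        # Permuting the y-sample breaks any dependence on x.
        null[b] = np.sum(Kc * L[np.ix_(p, p)]) / n ** 2
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    y = x ** 2 + 0.1 * rng.normal(size=200)  # dependent but uncorrelated
    print(hsic_permutation_test(x, y))
```

Each permutation needs another pass over an n-by-n Gram matrix, which is the computational cost that a closed-form approximation to the null distribution, such as the Gamma approximation or the chi-squared approximation adopted here, is meant to avoid.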