Testing the independence between random vectors is a fundamental problem in statistics. Distance correlation, a recently popular dependence measure, is universally consistent for testing independence against all distributions with finite moments. However, when data are subject to selection bias or collected from multiple sources or schemes, spurious dependence may arise. This creates a need for methods that can effectively utilize data from different sources and correct these biases. In this paper, we study the estimation of distance covariance and distance correlation under multiple biased sampling models, which provide a natural framework for addressing these issues. Theoretical properties, including the strong consistency and asymptotic null distributions of the distance covariance and correlation estimators, and the rate at which the test statistic diverges under sequences of alternatives approaching the null, are established. A weighted permutation procedure is proposed to determine the critical value of the independence test. Simulation studies demonstrate that our approach improves both the estimation of distance correlation and the power of the test.
翻译:暂无翻译