In this paper, we propose a novel Euclidean-distance-based coefficient, named differential distance correlation, to measure the strength of dependence between a random variable $ Y \in \mathbb{R} $ and a random vector $ \boldsymbol{X} \in \mathbb{R}^p $. The coefficient has a concise expression and is invariant to arbitrary orthogonal transformations of the random vector. Moreover, the coefficient is a strongly consistent estimator of a simple and interpretable dependence measure, which equals 0 if and only if $ \boldsymbol{X} $ and $ Y $ are independent and equals 1 if and only if $ Y $ determines $ \boldsymbol{X} $ almost surely. Because the coefficient is not robust to outliers, we also propose an alternative approach that addresses this limitation. Furthermore, under the hypothesis of independence, the coefficient is asymptotically normal with a simple variance, which facilitates fast and accurate estimation of the $ p $-value for testing independence. Three simulation experiments show that the proposed coefficient is more computationally efficient for independence testing and more effective at detecting oscillatory relationships than several competing methods. We also apply our method to a real data example.