Recently, Chatterjee (2021) introduced a new rank-based correlation coefficient which can be used to measure the strength of dependence between two random variables. This coefficient has already attracted much attention, as it converges to the Dette-Siburg-Stoimenov measure (see Dette et al. (2013)), which equals $0$ if and only if the variables are independent and $1$ if and only if one variable is a measurable function of the other. Further, Chatterjee's coefficient is computable in (near) linear time, which makes it appropriate for large-scale applications. In this paper, we expand the theoretical understanding of Chatterjee's coefficient in two directions: (a) First, we consider the problem of testing for independence using Chatterjee's correlation. We obtain its asymptotic distribution under any changing sequence of alternatives converging to the null hypothesis (of independence). We further obtain a general result that gives exact detection thresholds and limiting power for Chatterjee's test of independence under natural nonparametric alternatives converging to the null. As applications of this general result, we prove an $n^{-1/4}$ detection boundary for this test and explicitly compute the limiting local power on the detection boundary for commonly studied alternatives in the literature. (b) We then construct a test for non-trivial levels of dependence using Chatterjee's coefficient. In contrast to testing for independence, we prove that, in this case, Chatterjee's coefficient indeed yields a minimax optimal procedure with an $n^{-1/2}$ detection boundary. Our proof techniques rely on Stein's method of exchangeable pairs, a non-asymptotic projection result, and information-theoretic lower bounds.
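As a brief illustration (not part of the paper itself), the near-linear-time computability mentioned above follows from the fact that, in the absence of ties, Chatterjee's coefficient is a simple function of sorted ranks: ordering the pairs $(X_i, Y_i)$ by $X$, letting $r_i$ denote the rank of the $i$-th $Y$-value in that ordering, one has $\xi_n = 1 - 3\sum_{i=1}^{n-1}|r_{i+1}-r_i| / (n^2-1)$. The sketch below implements this no-ties formula; the function name is our own choice.

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n (no-ties formula, Chatterjee 2021).

    Cost is dominated by two sorts, hence O(n log n).
    """
    x = np.asarray(x)
    y = np.asarray(y)
    n = len(x)
    order = np.argsort(x, kind="stable")           # sort the pairs by x
    # r_i = rank of the i-th y-value (in x-order) among all y-values, 1-based
    ranks = np.argsort(np.argsort(y[order])) + 1
    # xi_n = 1 - 3 * sum |r_{i+1} - r_i| / (n^2 - 1)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)
```

For a strictly monotone relationship the successive rank gaps are all $1$, giving $\xi_n = 1 - 3/(n+1)$, which tends to $1$ as $n \to \infty$; under independence the statistic concentrates near $0$, consistent with the Dette-Siburg-Stoimenov limit described above. Note that the coefficient is intentionally asymmetric in $x$ and $y$.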