We extend the Azadkia-Chatterjee dependence coefficient between a scalar response $Y$ and a multivariate covariate $X$ to the case where $X$ takes values in a general metric space, with particular attention to the case where $X$ is a curve. Although the extension is relatively straightforward at the population level, analyzing the asymptotic behavior of the estimator is difficult. The difficulty stems largely from the nearest neighbor structure of the infinite-dimensional covariate sample, a topic that has not previously been addressed in the literature. The primary contribution of this paper is to provide insights into this issue and to propose strategies to address it. Our findings also have significant implications for other graph-based methods facing similar challenges.