We propose a novel non-parametric estimator of the correlation between grouped measurements of a quantity in the presence of noise. This work is primarily motivated by functional brain network construction from fMRI data, where brain regions correspond to groups of spatial units and the correlation between pairs of regions defines the network. The challenge is that both noise and intra-regional correlation make classical estimators of inter-regional correlation inconsistent. While some existing methods handle one of these issues, no non-parametric approach tackles both simultaneously. To address this problem, we propose a trade-off between two procedures: correlating regional averages, which is not robust to intra-regional correlation, and averaging pairwise inter-regional correlations, which is not robust to noise. To that end, we project the data onto a space where Euclidean distance serves as a proxy for sample correlation. We then use hierarchical clustering to group highly correlated variables within each region prior to inter-regional correlation estimation. We provide consistency results and show empirically that our approach outperforms several other popular methods in terms of quality. Illustrations on real-world datasets further demonstrate its effectiveness.
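To make the two classical procedures concrete, the following is a minimal sketch in Python with NumPy. It does not implement the proposed estimator. The toy data model (one latent signal per region plus independent noise on each variable), the function names, and all parameter values are assumptions made for illustration only. The sketch also checks a standard identity that may underlie the projection step: for variables that are centered and scaled to unit norm, the squared Euclidean distance equals 2(1 - r), where r is the sample correlation. Whether the authors use exactly this construction is not stated in the abstract.

```python
import numpy as np


def standardize(x):
    """Center each row and scale it to unit Euclidean norm."""
    x = x - x.mean(axis=1, keepdims=True)
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def corr_of_averages(a, b):
    """Procedure 1: correlate the two regional average signals."""
    return float(np.corrcoef(a.mean(axis=0), b.mean(axis=0))[0, 1])


def average_of_corrs(a, b):
    """Procedure 2: average all pairwise inter-regional sample correlations."""
    za, zb = standardize(a), standardize(b)
    # For standardized rows, the dot product is the sample correlation.
    return float((za @ zb.T).mean())


def sq_dist_and_corr(x, y):
    """Return (squared distance between standardized x and y, sample correlation)."""
    zx, zy = standardize(np.vstack([x, y]))
    return float(np.sum((zx - zy) ** 2)), float(np.corrcoef(x, y)[0, 1])


def simulate(n_vars=20, n_samples=500, rho=0.6, noise_sd=1.0, seed=0):
    """Hypothetical toy model: two regions, each a latent signal plus noise."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    latent = rng.multivariate_normal(np.zeros(2), cov, size=n_samples).T
    a = latent[0] + noise_sd * rng.standard_normal((n_vars, n_samples))
    b = latent[1] + noise_sd * rng.standard_normal((n_vars, n_samples))
    return a, b


if __name__ == "__main__":
    a, b = simulate()
    print("correlation of averages:", corr_of_averages(a, b))
    print("average of correlations:", average_of_corrs(a, b))
    d2, r = sq_dist_and_corr(a[0], b[0])
    print("squared distance:", d2, " 2 * (1 - r):", 2 * (1 - r))
```

In this toy setting, the noise attenuates every pairwise correlation, so the average of correlations falls below the correlation of averages. The sketch only illustrates the noise sensitivity of the second procedure. It does not reproduce the failure of regional averaging under heterogeneous intra-regional correlation, because every variable in a region shares a single latent signal.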