We consider the inference problem for high-dimensional linear models in which the covariates have an underlying spatial organization that is reflected in their correlation. A typical example is high-resolution imaging, where neighboring pixels are usually very similar. In this setting, accurate point estimates and confidence intervals cannot be obtained: there are many more covariates than samples, and the covariates are highly correlated. This calls for a reformulation of the statistical inference problem that takes the underlying spatial structure into account: if covariates are locally correlated, it is acceptable to detect them up to a given spatial uncertainty. We therefore propose to rely on the $\delta$-FWER, that is, the probability of making a false discovery at a distance greater than $\delta$ from any true positive. With this target measure in mind, we study the properties of ensembled clustered inference algorithms, which combine three techniques: spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions. We show that, under standard assumptions, ensembled clustered inference algorithms control the $\delta$-FWER for $\delta$ equal to the largest cluster diameter. We complement the theoretical analysis with empirical results that demonstrate accurate $\delta$-FWER control and decent power for such inference algorithms.
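To make the $\delta$-FWER concrete, the following minimal sketch estimates it empirically over repeated runs. It rests on assumptions that are not part of the abstract: covariates are taken to lie on a one-dimensional grid, the distance between two covariates is the absolute difference of their indices, and the function names `has_delta_false_discovery` and `empirical_delta_fwer` are hypothetical.

```python
def has_delta_false_discovery(selected, support, delta):
    """Return True if some selected index is farther than delta from every true index.

    Assumption: covariates live on a 1D grid, so distance is |j - k|.
    """
    if not support:
        # With no true signal, any discovery at all is a false discovery.
        return len(selected) > 0
    # A selected index j is a delta-false discovery when its distance to the
    # nearest index of the true support exceeds delta.
    return any(min(abs(j - k) for k in support) > delta for j in selected)


def empirical_delta_fwer(selections, support, delta):
    """Fraction of runs that contain at least one delta-false discovery."""
    flags = [has_delta_false_discovery(s, support, delta) for s in selections]
    return sum(flags) / len(flags)


# Illustrative (made-up) selections from four hypothetical runs.
support = [10, 11, 12]
selections = [[9, 10], [15], [], [13, 20]]
print(empirical_delta_fwer(selections, support, delta=2))
```

With $\delta = 0$ this quantity reduces to the usual empirical FWER; increasing $\delta$ tolerates discoveries that fall close to the true support, which is the spatial tolerance the abstract argues for.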