Testing high-dimensional mean vectors for two or more groups remains a very active research area. In these settings, traditional tests are not applicable because they require inverting rank-deficient group covariance matrices. Current approaches address this problem by assuming a sparse or diagonal covariance matrix, potentially ignoring complex dependencies between features. In this paper, we develop a Bayes factor (BF) based testing procedure for comparing two or more population means in (very) high-dimensional settings. We consider two versions of the Bayes factor based test statistic, both built on a random projection (RP) approach. RPs are appealing because they make no assumption about the form of the dependency across features in the data. The final test statistic is based on an ensemble of Bayes factors corresponding to multiple replications of randomly projected data. Both proposed test statistics are compared through a battery of simulation settings. Finally, they are applied to the analysis of a publicly available genomic single-cell RNA-seq (scRNA-seq) dataset.
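The rank-deficiency issue and the effect of a random projection can be illustrated with a minimal sketch. It is not the paper's Bayes factor procedure: it only shows that a sample covariance matrix built from fewer observations than features cannot be inverted, while the covariance of randomly projected data can. The function name `projected_cov_ranks`, the Gaussian projection matrix, and the sizes `n`, `p`, `k`, and `n_proj` are all assumptions chosen for illustration.

```python
import numpy as np

def projected_cov_ranks(n=20, p=200, k=5, n_proj=10, seed=0):
    """Compare the covariance rank before and after random projection.

    Hypothetical helper: n observations, p features (p > n), and
    k projected dimensions (k < n). Not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))  # simulated data for a single group

    # The p x p sample covariance has rank at most n - 1, so it is singular.
    full_rank = np.linalg.matrix_rank(np.cov(X, rowvar=False))

    proj_ranks = []
    for _ in range(n_proj):
        # Assumed projection: i.i.d. Gaussian entries, scaled by 1/sqrt(k).
        R = rng.standard_normal((p, k)) / np.sqrt(k)
        Z = X @ R  # n x k projected data
        # The k x k covariance of Z is full rank whenever k < n (almost surely).
        proj_ranks.append(np.linalg.matrix_rank(np.cov(Z, rowvar=False)))
    return full_rank, proj_ranks

full_rank, proj_ranks = projected_cov_ranks()
print("rank of full covariance:", full_rank)
print("ranks after projection:", proj_ranks)
```

In an ensemble scheme of the kind described above, a per-projection statistic would be computed from each `Z` and the results combined across replications; the choice of statistic and of the combination rule is left open here.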