Replicability is a fundamental quality of scientific discoveries: we are interested in signals that are detectable across different laboratories, study populations, time points, etc. Unlike meta-analysis, which accounts for experimental variability but does not guarantee replicability, testing a partial conjunction (PC) null aims specifically to identify signals that are discovered in multiple studies. In many contemporary applications, e.g., comparing multiple high-throughput genetic experiments, a large number $M$ of PC nulls need to be tested simultaneously, calling for a multiple comparisons correction. However, standard multiple testing adjustments applied to the $M$ PC $p$-values can be severely conservative, especially when $M$ is large and the signals are sparse. We introduce AdaFilter, a new multiple testing procedure that increases power by adaptively filtering out unlikely candidates of PC nulls. We prove that AdaFilter can control the family-wise error rate (FWER) and the false discovery rate (FDR) as long as data across studies are independent, and that it has much higher power than other existing methods. We illustrate the application of AdaFilter with three examples: microarray studies of Duchenne muscular dystrophy, single-cell RNA sequencing of T cells in lung cancer tumors, and GWAS for metabolomics.
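To make the "standard adjustment" baseline concrete, the following is a minimal sketch of one common way to form PC $p$-values and then correct them. It is not AdaFilter. It assumes a replicability requirement of "non-null in at least $r$ of $n$ studies" (the symbols $r$ and $n$ are introduced here for illustration), uses the Bonferroni-style PC $p$-value $(n - r + 1)\,p_{(r)}$, and applies the Benjamini-Hochberg step-up rule to the resulting $M$ values. The function names are hypothetical.

```python
import numpy as np

def pc_pvalues_bonferroni(pvals, r):
    """Bonferroni-style PC p-values (illustrative baseline, not AdaFilter).

    pvals: array of shape (M, n), one row per hypothesis, one column per study.
    r:     required number of studies with a signal (assumed notation).
    """
    pvals = np.asarray(pvals, dtype=float)
    n = pvals.shape[1]
    sorted_p = np.sort(pvals, axis=1)
    # Take the r-th smallest p-value in each row, scale by (n - r + 1), cap at 1.
    return np.minimum(1.0, (n - r + 1) * sorted_p[:, r - 1])

def benjamini_hochberg(p, alpha=0.05):
    """Boolean rejection mask from the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Reject every hypothesis up to the largest index meeting its threshold.
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True
    return reject

# Toy input: 3 hypotheses, 2 studies, signal required in both studies (r = 2).
toy = [[0.001, 0.002], [0.001, 0.9], [0.5, 0.6]]
pc = pc_pvalues_bonferroni(toy, r=2)
rejected = benjamini_hochberg(pc, alpha=0.05)
```

In the toy input, the second hypothesis has a strong signal in only one study, so its PC $p$-value is driven by the weaker study and it is not rejected. Because each PC $p$-value is governed by the weaker studies, many rows carry large PC $p$-values when signals are sparse, yet they still count toward the multiplicity $M$; this is the source of conservativeness that the abstract describes.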