We propose a new empirical Bayes method for covariate-assisted multiple testing with false discovery rate (FDR) control, where we model the local false discovery rate for each hypothesis as a function of both its covariates and p-value. Our method refines the adaptive p-value thresholding (AdaPT) procedure by generalizing its masking scheme to reduce the bias and variance of its false discovery proportion estimator, improving the power when the rejection set is small or some null p-values concentrate near 1. We also introduce a Gaussian mixture model for the conditional distribution of the test statistics given covariates, modeling the mixing proportions with a generic user-specified classifier, which we implement using a two-layer neural network. Like AdaPT, our method provably controls the FDR in finite samples even if the classifier or the Gaussian mixture model is misspecified. We show in extensive simulations and real data examples that our new method, which we call AdaPT-GMM, consistently delivers high power relative to competing state-of-the-art methods. In particular, it performs well in scenarios where AdaPT is underpowered, and is especially well-suited for testing composite null hypotheses, such as whether the effect size exceeds a practical significance threshold.
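To make the role of the masking scheme concrete, the sketch below illustrates the false discovery proportion estimate used by the original AdaPT procedure, (1 + A) / max(R, 1), where R counts p-values at or below the threshold and A counts p-values in the mirror region near 1. It is only an illustration under simplifying assumptions: the function names `mask_p_values` and `adapt_fdp_estimate` are hypothetical, the threshold is taken as a constant rather than a function of covariates, and the generalized masking scheme of AdaPT-GMM is not reproduced here.

```python
import numpy as np


def mask_p_values(p_values):
    """Original AdaPT masking: the analyst sees only min(p, 1 - p)."""
    p = np.asarray(p_values, dtype=float)
    return np.minimum(p, 1.0 - p)


def adapt_fdp_estimate(p_values, thresholds):
    """Return (fdp_hat, R, A) for the original AdaPT estimator.

    R = number of p-values with p <= s (candidate rejections).
    A = number of p-values with p >= 1 - s (mirror region near 1).
    fdp_hat = (1 + A) / max(R, 1).
    `thresholds` may be a scalar or one value per hypothesis (assumption:
    all thresholds are below 0.5 so the two regions do not overlap).
    """
    p = np.asarray(p_values, dtype=float)
    s = np.broadcast_to(np.asarray(thresholds, dtype=float), p.shape)
    r = int(np.sum(p <= s))
    a = int(np.sum(p >= 1.0 - s))
    return (1 + a) / max(r, 1), r, a


if __name__ == "__main__":
    # Made-up p-values for illustration only.
    p = [0.001, 0.002, 0.01, 0.03, 0.2, 0.5, 0.6, 0.97]
    fdp_hat, r, a = adapt_fdp_estimate(p, 0.05)
    print(f"R = {r}, A = {a}, estimated FDP = {fdp_hat:.2f}")
```

With only four candidate rejections, the additive 1 in the numerator and a single p-value near 1 already push the estimate to 0.5, which shows why this estimator can be conservative and noisy when the rejection set is small or null p-values concentrate near 1.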