Adaptive multiple testing with covariates is an important research direction that has attracted considerable attention in recent years. It is widely recognized that leveraging side information from auxiliary covariates can improve the power of false discovery rate (FDR) procedures. Currently, most such procedures are built with $p$-values as their main statistics. However, for two-sided hypotheses, the usual data processing step that transforms the primary statistics, known as $z$-values, into $p$-values not only loses information carried by the main statistics, but can also undermine the ability of the covariates to assist with FDR inference. We develop a $z$-value based covariate-adaptive (ZAP) methodology that operates on the intact structural information encoded jointly by the $z$-values and covariates. It seeks to emulate the oracle $z$-value procedure via a working model, and its rejection regions depart significantly from those of $p$-value adaptive testing approaches. The key strength of ZAP is that FDR control is guaranteed under minimal assumptions, even when the working model is misspecified. We demonstrate the state-of-the-art performance of ZAP on both simulated and real data, showing that the efficiency gain over $p$-value based methods can be substantial. Our methodology is implemented in the $\texttt{R}$ package $\texttt{zap}$.
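To make the information-loss point concrete, the following minimal sketch shows the standard two-sided transformation from a $z$-value to a $p$-value, assuming a standard normal null distribution. It is an illustration only and does not use or reproduce the $\texttt{zap}$ package; the function name `two_sided_p_value` and the example $z$-values are hypothetical choices made for this sketch.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value of a z-value, assuming a N(0, 1) null (illustrative helper)."""
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical z-values: pairs with equal magnitude and opposite sign.
z_values = [2.5, -2.5, 0.3, -0.3]
p_values = [two_sided_p_value(z) for z in z_values]

for z, p in zip(z_values, p_values):
    print(f"z = {z:+.1f}  ->  p = {p:.4f}")
```

Because the transformation depends on $z$ only through $|z|$, a positive and a negative $z$-value of the same magnitude map to the same $p$-value, so the sign of the effect is no longer available to any procedure that starts from $p$-values.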