Consider a simultaneous hypothesis testing problem in which each hypothesis is associated with a test statistic. Suppose the null distribution of the test statistics is difficult to obtain, but some null hypotheses, referred to as internal negative controls, are known to be true. When it is reasonable to assume that the test statistics of the negative controls are exchangeable with those of the unknown true null hypotheses, we propose to use a statistic's Rank Among Negative Controls (RANC) as a p-value for the corresponding hypothesis. We provide two theoretical perspectives on this proposal. First, we view the empirical distribution of the negative control statistics as an estimate of the null distribution. We use this view to show that, when the test statistics are exchangeable, the RANC p-values are individually valid and have positive regression dependence on the subset of true nulls. Second, we study the empirical processes of the test statistics indexed by the rejection threshold. We use this to show that the Benjamini-Hochberg procedure applied to the RANC p-values may still control the false discovery rate when the test statistics are not exchangeable. The practical performance of our method is illustrated using numerical simulations and a real proteomic dataset.
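As a minimal sketch of the idea, the following Python code computes rank-based p-values from a set of negative control statistics and then applies the Benjamini-Hochberg step-up procedure. The abstract does not state the exact RANC formula, so the code assumes that larger statistics are more extreme and uses the common convention p = (1 + number of controls at least as large as the statistic) / (1 + number of controls). The function names `ranc_pvalues` and `benjamini_hochberg` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


def ranc_pvalues(stats, controls):
    """Rank-among-negative-controls p-values (illustrative convention).

    Assumption: larger statistics are more extreme, and
    p = (1 + #{controls >= statistic}) / (1 + #controls).
    """
    sorted_controls = sorted(controls)
    m = len(sorted_controls)
    pvals = []
    for t in stats:
        # bisect_left gives the number of controls strictly below t,
        # so m minus that is the number of controls >= t.
        n_at_least = m - bisect_left(sorted_controls, t)
        pvals.append((1 + n_at_least) / (m + 1))
    return pvals


def benjamini_hochberg(pvals, alpha=0.1):
    """Benjamini-Hochberg step-up procedure; returns a rejection flag per hypothesis."""
    n = len(pvals)
    # Indices of the hypotheses ordered by increasing p-value.
    order = sorted(range(n), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value is at or below its threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / n:
            k = rank
    reject = [False] * n
    for i in order[:k]:
        reject[i] = True
    return reject


if __name__ == "__main__":
    # Toy numbers, made up purely to show the call pattern.
    controls = [1.0, 2.0, 3.0, 4.0]
    stats = [0.5, 2.5, 10.0]
    p = ranc_pvalues(stats, controls)
    print(p)
    print(benjamini_hochberg(p, alpha=0.25))
```

With m negative controls, the smallest attainable p-value under this convention is 1 / (m + 1), so the number of controls limits how small a p-value can get and therefore how many rejections the Benjamini-Hochberg step can make.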