We consider the problem of multiple hypothesis testing when the hypotheses have a logical nested structure. When one hypothesis is nested inside another, the outer hypothesis must be false if the inner hypothesis is false. We model the nested structure as a directed acyclic graph, which includes chain and tree graphs as special cases. Each node in the graph is a hypothesis, and rejecting a node requires also rejecting all of its ancestors. We propose a general framework for adjusting node-level test statistics using the known logical constraints. Within this framework, we study a smoothing procedure that combines each node with all of its descendants to form a more powerful statistic. We prove that a broad class of smoothing strategies can be used with existing selection procedures to control the familywise error rate, false discovery exceedance rate, or false discovery rate, so long as the original test statistics are independent under the null. When the null statistics are not independent but are derived from positively correlated normal observations, we prove control for all three error rates when the smoothing method is arithmetic averaging of the observations. Simulations and an application to a real biology dataset demonstrate that smoothing leads to substantial power gains.
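To make the smoothing idea concrete, here is a minimal Python sketch of one possible smoothing rule under the independence assumption. The abstract does not name a specific combining rule, so the use of Fisher's method here is an assumption made for illustration, and the names `descendants`, `chi2_sf_even`, and `fisher_smooth` are hypothetical. The sketch relies on the logical constraint stated above: if a node's null is true, the nulls of all its descendants are true as well, so their p-values can be pooled into a single test of that node.

```python
import math


def descendants(children, node):
    """Return the set containing `node` and every node reachable from it."""
    seen = {node}
    stack = [node]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in seen:  # a DAG may reach a node by several paths
                seen.add(child)
                stack.append(child)
    return seen


def chi2_sf_even(x, k):
    """Survival function of a chi-squared variable with 2*k degrees of freedom.

    For even degrees of freedom it has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_smooth(children, pvalues):
    """Combine each node's p-value with those of all its descendants.

    Assumes (for illustration) independent null p-values, all strictly
    positive, and uses Fisher's statistic -2 * sum(log p).
    """
    smoothed = {}
    for node in pvalues:
        group = descendants(children, node)
        stat = -2.0 * sum(math.log(pvalues[v]) for v in group)
        smoothed[node] = chi2_sf_even(stat, len(group))
    return smoothed


if __name__ == "__main__":
    # Hypothetical chain a -> b -> c with made-up p-values.
    chain = {"a": ["b"], "b": ["c"]}
    raw = {"a": 0.04, "b": 0.03, "c": 0.02}
    for name, value in fisher_smooth(chain, raw).items():
        print(name, value)
```

In this sketch a leaf keeps its original p-value, since it has no descendants to pool, while a node with several moderately small descendant p-values receives a smaller smoothed p-value than any of them alone would give. The smoothed values would then be passed to a selection procedure that respects the graph structure; that step is not shown here.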