Multiple hypothesis testing is widely applied to problems involving high-dimensional data, e.g., selecting significant variables while controlling the selection error rate. The most prevalent error-rate measure in multiple hypothesis testing is the false discovery rate (FDR). In recent years, the local false discovery rate (fdr) has drawn much attention because it assesses the confidence of individual hypotheses. However, most methods estimate the fdr from p-values or from statistics with known null distributions, which are sometimes unavailable or unreliable. Adopting the methodology of competition-based procedures, e.g., the knockoff filter, this paper proposes a new approach to local false discovery rate estimation, named TDfdr, that requires neither p-values nor known null distributions. Simulation results demonstrate that TDfdr accurately estimates the fdr with two competition-based procedures. In real data analysis, the power of TDfdr for variable selection is verified on two biological datasets.
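For readers unfamiliar with competition-based procedures, the following is a minimal sketch of how such a procedure can control the FDR without p-values, assuming the standard knockoff+ thresholding rule: each variable has a signed statistic W_j, where a large positive value means the original variable beat its knockoff, and the count of large negative values serves as an estimate of the number of false positives. The function names `knockoff_plus_threshold` and `select`, and the toy statistics in the usage note, are hypothetical and chosen for illustration only. This sketch is not the TDfdr estimator, whose details are not given in this abstract.

```python
import numpy as np


def knockoff_plus_threshold(w, q):
    """Return the smallest t among the nonzero |w_j| such that
    (1 + #{j: w_j <= -t}) / max(1, #{j: w_j >= t}) <= q,
    or infinity if no such t exists (assumed knockoff+ rule)."""
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes, in ascending order.
    candidates = np.unique(np.abs(w[w != 0]))
    for t in candidates:
        n_negative = np.sum(w <= -t)  # proxy for the number of false positives
        n_positive = np.sum(w >= t)   # number of variables that would be selected
        fdp_estimate = (1 + n_negative) / max(1, n_positive)
        if fdp_estimate <= q:
            return float(t)
    return float("inf")


def select(w, q):
    """Indices of variables whose statistic reaches the threshold."""
    w = np.asarray(w, dtype=float)
    t = knockoff_plus_threshold(w, q)
    return np.flatnonzero(w >= t)
```

As a usage example, calling `select([5, 4, 3.5, 3, 2.5, 2, -1.5, 1, -0.5, 0.2], q=0.2)` picks the threshold 2.0 and selects the first six variables, since that is the first threshold at which the estimated false discovery proportion (1/6) falls below 0.2.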