Model-X knockoffs is a flexible wrapper method for high-dimensional regression algorithms, which provides guaranteed control of the false discovery rate (FDR). Due to the randomness inherent to the method, different runs of model-X knockoffs on the same dataset often result in different sets of selected variables, which is undesirable in practice. In this paper, we introduce a methodology for derandomizing model-X knockoffs with provable FDR control. The key insight of our proposed method lies in the discovery that the knockoffs procedure is in essence an e-BH procedure. We make use of this connection, and derandomize model-X knockoffs by aggregating the e-values resulting from multiple knockoff realizations. We prove that the derandomized procedure controls the FDR at the desired level, without any additional conditions (in contrast, previously proposed methods for derandomization are not able to guarantee FDR control). The proposed method is evaluated with numerical experiments, where we find that the derandomized procedure achieves comparable power and dramatically decreased selection variability when compared with model-X knockoffs.
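The aggregation idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. The function names (`knockoff_e_values`, `average_e_values`, `e_bh`) are hypothetical, and the conversion from knockoff statistics to e-values follows the usual knockoff filter threshold, which is an assumption here since the abstract does not spell it out. The e-values from each knockoff run are averaged per variable, and the e-BH procedure is then applied to the averages.

```python
from typing import List, Sequence


def knockoff_e_values(w: Sequence[float], alpha_kn: float) -> List[float]:
    """Turn knockoff statistics W_j into e-values (assumed construction).

    T is the smallest threshold t with (1 + #{W_k <= -t}) / #{W_k >= t} <= alpha_kn,
    and e_j = m * 1{W_j >= T} / (1 + #{W_k <= -T}).
    """
    m = len(w)
    # Candidate thresholds are the nonzero magnitudes, scanned in increasing order.
    for t in sorted({abs(x) for x in w if x != 0}):
        n_neg = sum(1 for x in w if x <= -t)
        n_pos = sum(1 for x in w if x >= t)
        if n_pos > 0 and (1 + n_neg) / n_pos <= alpha_kn:
            return [m / (1 + n_neg) if x >= t else 0.0 for x in w]
    return [0.0] * m  # no valid threshold: every e-value is zero


def average_e_values(runs: Sequence[Sequence[float]]) -> List[float]:
    """Average the e-values of each variable across knockoff runs."""
    n_runs = len(runs)
    return [sum(col) / n_runs for col in zip(*runs)]


def e_bh(e_values: Sequence[float], alpha: float) -> List[int]:
    """e-BH: reject the k largest e-values, where k is the largest index
    such that the k-th largest e-value is at least m / (alpha * k)."""
    m = len(e_values)
    order = sorted(range(m), key=lambda j: e_values[j], reverse=True)
    k_hat = 0
    for k, j in enumerate(order, start=1):
        if e_values[j] >= m / (alpha * k):
            k_hat = k
    return sorted(order[:k_hat])


# Toy knockoff statistics (made up for illustration).
w = [5.0, 4.0, 3.0, -1.0, 2.0, -0.5]
single_run = e_bh(knockoff_e_values(w, alpha_kn=0.5), alpha=0.5)

# Toy e-values from three runs (made up): single runs disagree on the selection.
runs = [
    [24.0, 18.0, 0.0, 0.0],
    [24.0, 0.0, 18.0, 0.0],
    [24.0, 18.0, 0.0, 0.0],
]
per_run = [e_bh(r, alpha=0.2) for r in runs]
derandomized = e_bh(average_e_values(runs), alpha=0.2)
```

In the first toy example, applying e-BH to the knockoff e-values at the same level selects exactly the variables with statistics at or above the knockoff threshold, which illustrates the connection the abstract refers to. In the second, the individual runs select different sets, while the averaged e-values yield a single selection.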