Controlling the False Discovery Rate (FDR) in a variable selection procedure is critical for reproducible discoveries, and it has been extensively studied in sparse linear models. However, it remains largely open in scenarios where the sparsity constraint is not directly imposed on the parameters but on a linear transformation of the parameters to be estimated. Examples of such scenarios include total variations, wavelet transforms, fused LASSO, and trend filtering. In this paper, we propose a data-adaptive FDR control method, called the Split Knockoff method, for this transformational sparsity setting. The proposed method exploits both variable and data splitting. The linear transformation constraint is relaxed to its Euclidean proximity in a lifted parameter space, which yields an orthogonal design that enables the orthogonal Split Knockoff construction. To overcome the challenge that exchangeability fails due to the heterogeneous noise brought by the transformation, new inverse supermartingale structures are developed via data splitting for provable FDR control without sacrificing power. Simulation experiments demonstrate that the proposed methodology achieves the desired FDR and power. We also provide an application to Alzheimer's Disease study, where atrophy brain regions and their abnormal connections can be discovered based on a structural Magnetic Resonance Imaging dataset (ADNI).
翻译:暂无翻译