Normalizing Flows for Knockoff-free Controlled Feature Selection

Controlled feature selection aims to discover the features a response depends on while limiting the false discovery rate (FDR) to a predefined level. Recently, multiple deep-learning-based methods have been proposed to perform controlled feature selection through the Model-X knockoff framework. We demonstrate, however, that these methods often fail to control the FDR for two reasons. First, these methods often learn inaccurate models of features. Second, the "swap" property, which is required for knockoffs to be valid, is often not well enforced. We propose a new procedure called FlowSelect to perform controlled feature selection that does not suffer from either of these two problems. To more accurately model the features, FlowSelect uses normalizing flows, the state-of-the-art method for density estimation. Instead of enforcing the "swap" property, FlowSelect uses a novel MCMC-based procedure to calculate p-values for each feature directly. Asymptotically, FlowSelect computes valid p-values. Empirically, FlowSelect consistently controls the FDR on both synthetic and semi-synthetic benchmarks, whereas competing knockoff-based approaches do not. FlowSelect also demonstrates greater power on these benchmarks. Additionally, FlowSelect correctly infers the genetic variants associated with specific soybean traits from GWAS data.
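The abstract says that FlowSelect computes a p-value for each feature and that the FDR is held to a predefined level, but it does not spell out how either step is carried out. The sketch below is therefore only an illustration under stated assumptions: `empirical_p_value` is a hypothetical helper that turns null samples of a test statistic (however they were drawn, for example by an MCMC sampler) into a Monte Carlo p-value, and `benjamini_hochberg` applies the standard Benjamini-Hochberg step-up rule, which is assumed here as the selection rule and is not confirmed by the abstract.

```python
from typing import List, Sequence


def empirical_p_value(observed: float, null_stats: Sequence[float]) -> float:
    """One-sided Monte Carlo p-value with the usual +1 correction.

    `null_stats` are test statistics recomputed on null samples of a feature.
    How those samples are drawn is outside this sketch (assumption).
    """
    exceed = sum(1 for t in null_stats if t >= observed)
    return (1 + exceed) / (1 + len(null_stats))


def benjamini_hochberg(p_values: Sequence[float], fdr_level: float) -> List[int]:
    """Return the indices of features selected at the nominal FDR level.

    Standard step-up rule: find the largest rank k such that the k-th
    smallest p-value is <= fdr_level * k / m, then select the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr_level * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


if __name__ == "__main__":
    # Toy numbers, invented purely for illustration.
    p = [0.001, 0.8, 0.02, 0.5]
    print(benjamini_hochberg(p, fdr_level=0.1))
```

The +1 in the numerator and denominator keeps the Monte Carlo p-value strictly positive, so a finite number of null samples can never produce a p-value of exactly zero.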

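Feature selection

Feature selection, also called feature subset selection (FSS) or attribute selection, means choosing N features from an existing set of M features so that a given performance criterion of the system is optimized. It is the process of picking the most effective features from the original ones in order to reduce the dimensionality of a dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are the key to training a model.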
