Deep neural networks (DNNs) have become increasingly popular and have achieved outstanding performance in predictive tasks. However, the DNN framework itself cannot tell the user which features are more or less relevant to the prediction, which limits its applicability in many scientific fields. We introduce neural Gaussian mirrors (NGMs), in which mirrored features are created, via a structured perturbation based on a kernel-based conditional dependence measure, to help evaluate feature importance. We design two modifications of the DNN architecture that incorporate the mirrored features and provide mirror statistics for measuring feature importance. As shown in simulated and real data examples, the proposed method controls the feature selection error rate at a predefined level and maintains high selection power even in the presence of highly correlated features.
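The abstract does not give formulas for the mirrored features or for the selection rule, so the following is only a minimal sketch under stated assumptions. It assumes that a mirrored pair is formed by adding and subtracting scaled Gaussian noise, and that features are selected with a data-driven threshold on the mirror statistics, where large positive values suggest relevance and null features are roughly symmetric around zero. The function names, the scale `c` (taken here as given rather than derived from the kernel-based dependence measure), and the example numbers are all hypothetical.

```python
import numpy as np


def make_mirrored_pair(x_j, c, rng):
    """Return an assumed mirrored pair (x_j + c*z, x_j - c*z), z ~ N(0, I).

    Assumption: the scale c is supplied by the caller; the abstract says it
    comes from a kernel-based conditional dependence measure, which is not
    implemented here.
    """
    z = rng.standard_normal(x_j.shape)
    return x_j + c * z, x_j - c * z


def mirror_threshold(mirror_stats, q):
    """Smallest t whose estimated false discovery proportion is at most q.

    Assumed estimate: #{M_j <= -t} / max(#{M_j >= t}, 1), which relies on
    null mirror statistics being symmetric around zero.
    """
    m = np.asarray(mirror_stats, dtype=float)
    candidates = np.sort(np.abs(m[m != 0]))
    for t in candidates:
        fdp_hat = np.sum(m <= -t) / max(np.sum(m >= t), 1)
        if fdp_hat <= q:
            return t
    return np.inf  # no threshold meets the target level: select nothing


def select_features(mirror_stats, q):
    """Indices of features whose mirror statistic reaches the threshold."""
    m = np.asarray(mirror_stats, dtype=float)
    return np.flatnonzero(m >= mirror_threshold(m, q))


# Hypothetical mirror statistics for seven features.
stats = [5.0, 4.0, 3.0, -0.5, 0.4, -0.3, 0.2]
selected = select_features(stats, q=0.2)
```

With these made-up statistics and a target level of 0.2, the rule keeps only the three features with large positive mirror statistics. How the mirror statistics themselves are computed from the two modified DNN architectures is not covered by this sketch.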