In recent years, deep learning has been at the center of analytics owing to its impressive empirical success in analyzing complex data objects. Despite this success, most existing tools behave like black boxes, which has driven growing interest in interpretable, reliable, and robust deep learning models applicable to a broad class of applications. Feature-selected deep learning has emerged as a promising tool in this realm. However, recent developments accommodate neither ultra-high-dimensional, highly correlated features nor high noise levels. In this article, we propose a novel screening and cleaning method that uses deep learning for data-adaptive, multi-resolution discovery of highly correlated predictors with a controlled error rate. Extensive empirical evaluations over a wide range of simulated scenarios and several real datasets demonstrate the effectiveness of the proposed method in achieving high power while keeping the false discovery rate at a minimum.