As a computationally fast and efficient tool, sure independence screening has received much attention for ultrahigh-dimensional problems. This paper contributes two robust, model-free sure screening approaches that simultaneously account for heteroscedasticity, outliers, heavy-tailed distributions, continuous or discrete responses, and confounding effects. First, we define a robust correlation measure that uses only two random indicators, and introduce a screener based on that correlation. Second, we propose a robust partial correlation-based screening approach for settings in which an exposure variable is available. To remove the confounding effect of the exposure on both the response and each covariate, we use nonparametric regression with a specified loss function. Specifically, this yields a robust correlation-based screening method (RC-SIS) and a robust partial correlation-based screening framework (RPC-SIS) comprising two concrete screeners, RPC-SIS(L2) and RPC-SIS(L1). Third, we establish the sure screening properties of RC-SIS, for which the response variable can be either continuous or discrete, as well as those of RPC-SIS(L2) and RPC-SIS(L1), under some regularity conditions. Our approaches are essentially nonparametric and are robust with respect to both the response and the covariates. Finally, extensive simulation studies and two applications demonstrate the superiority of the proposed approaches.
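To make the marginal screening idea concrete, the following is a minimal sketch in Python. The abstract does not give the definition of the robust correlation, so the utility used here, the Pearson correlation between the two indicators I(X_j <= median(X_j)) and I(Y <= median(Y)), is only an assumed stand-in and should not be read as the RC-SIS measure itself. The function names, the simulated model, and the cutoff d = floor(n / log n) are likewise illustrative assumptions.

```python
import numpy as np


def indicator_correlation(x, y):
    """Pearson correlation between I(x <= median(x)) and I(y <= median(y)).

    Assumed stand-in for a correlation built from two random indicators;
    it is NOT the measure defined in the paper.
    """
    a = (x <= np.median(x)).astype(float)
    b = (y <= np.median(y)).astype(float)
    sa, sb = a.std(), b.std()
    if sa == 0.0 or sb == 0.0:
        # A constant indicator carries no information about dependence.
        return 0.0
    return float(np.mean((a - a.mean()) * (b - b.mean())) / (sa * sb))


def screen(X, y, d):
    """Rank covariates by absolute marginal utility and keep the top d."""
    scores = np.array(
        [abs(indicator_correlation(X[:, j], y)) for j in range(X.shape[1])]
    )
    keep = np.argsort(-scores)[:d]
    return keep, scores


def demo(seed=0, n=200, p=1000):
    """Simulated example with heavy-tailed covariates and noise (p >> n)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_t(df=2, size=(n, p))
    noise = rng.standard_t(df=2, size=n)
    # Only the first two covariates are active in this toy model.
    y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + noise
    d = int(n / np.log(n))  # a commonly used screening cutoff (assumption)
    return screen(X, y, d)


if __name__ == "__main__":
    keep, scores = demo()
    print("retained covariates:", np.sort(keep))
```

Because the utility depends on the data only through indicators, it is unchanged by monotone transformations of the response and is insensitive to extreme values, which is the kind of robustness the abstract describes. Adjusting for an exposure variable, as in RPC-SIS, would require an additional nonparametric regression step that this sketch does not attempt.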