We consider the problem of variable screening in ultra-high dimensional generalized linear models (GLMs), where the number of covariates may grow at a non-polynomial order. Since the popular sure independence screening (SIS) approach is extremely unstable in the presence of contamination and noise, we propose a new robust screening procedure based on the minimum density power divergence estimator (MDPDE) of the marginal regression coefficients. The proposed procedure performs well with both pure and contaminated data. We provide a theoretical motivation for using marginal MDPDEs for variable screening at both the population and the sample level; in particular, we prove that the marginal MDPDEs are uniformly consistent, which yields the sure screening property of the proposed algorithm. Finally, we propose an MDPDE-based extension for robust conditional screening in GLMs and derive its sure screening property. The proposed methods are illustrated through extensive numerical studies and a real data application.
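To make the idea of screening by marginal MDPDEs concrete, the following is a minimal sketch, not the authors' implementation. It assumes the simplest special case of a GLM, a linear model with Gaussian errors, and fits each marginal regression by a standard fixed-point (reweighting) iteration for the density power divergence objective. The function names `marginal_mdpde_slope` and `mdpde_screen`, the tuning value `alpha=0.5`, the median/MAD starting values, and the simulated data are all illustrative assumptions.

```python
import math
import random
import statistics


def marginal_mdpde_slope(x, y, alpha=0.5, max_iter=100, tol=1e-8):
    """Slope of the MDPDE fit of y ~ b0 + b1*x with N(0, sigma^2) errors.

    Assumption: Gaussian linear model only; alpha = 0 reduces to least squares.
    """
    n = len(y)
    # Robust start: zero slope, median intercept, MAD-based scale.
    b0, b1 = statistics.median(y), 0.0
    mad = statistics.median([abs(v - b0) for v in y])
    sigma2 = (1.4826 * mad) ** 2 or 1.0
    c = alpha / (1.0 + alpha) ** 1.5  # bias-correction constant for sigma^2
    for _ in range(max_iter):
        # Observations with large residuals receive exponentially small weight.
        w = [math.exp(-alpha * (yi - b0 - b1 * xi) ** 2 / (2.0 * sigma2))
             for xi, yi in zip(x, y)]
        sw = sum(w)
        if sw <= 0.0:
            break
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
        denom = sw - n * c
        if sxx <= 0.0 or denom <= 0.0:
            break
        # Weighted least-squares update of intercept and slope.
        new_b1 = sxy / sxx
        new_b0 = ybar - new_b1 * xbar
        rss = sum(wi * (yi - new_b0 - new_b1 * xi) ** 2
                  for wi, xi, yi in zip(w, x, y))
        new_sigma2 = rss / denom
        if new_sigma2 <= 0.0:
            b0, b1 = new_b0, new_b1
            break
        change = abs(new_b0 - b0) + abs(new_b1 - b1) + abs(new_sigma2 - sigma2)
        b0, b1, sigma2 = new_b0, new_b1, new_sigma2
        if change < tol:
            break
    return b1


def mdpde_screen(columns, y, d, alpha=0.5):
    """Return indices of the d covariates with the largest |marginal MDPDE slope|.

    columns[j] holds the n observations of covariate j; each is standardized first.
    """
    scores = []
    for j, col in enumerate(columns):
        mean = sum(col) / len(col)
        sd = statistics.pstdev(col)
        if sd == 0.0:
            scores.append((0.0, j))
            continue
        z = [(v - mean) / sd for v in col]
        scores.append((abs(marginal_mdpde_slope(z, y, alpha)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:d]]


# Illustrative simulated data (an assumption, not from the paper):
# 3 active covariates out of 30, with 10% of responses shifted by +40.
rng = random.Random(0)
n, p = 300, 30
columns = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
y = [3.0 * (columns[0][i] + columns[1][i] + columns[2][i]) + rng.gauss(0.0, 1.0)
     for i in range(n)]
for i in range(0, n, 10):
    y[i] += 40.0
selected = mdpde_screen(columns, y, d=5)
```

Here `selected` holds the indices retained after screening. Setting `alpha=0` gives every observation unit weight, so the ranking reduces to ordinary marginal least squares; larger values of `alpha` trade some efficiency on clean data for robustness to outliers.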