This paper proposes a new robust smooth-threshold estimating equation that selects important variables and automatically estimates parameters for high-dimensional longitudinal data. A novel working correlation matrix is proposed to capture correlations within the same subject. The proposed procedure works well when the number of covariates p increases with the number of subjects n. The proposed estimates are competitive with those obtained under the true correlation structure, especially when the data are contaminated. Moreover, the proposed method is robust against outliers in the response variables and/or covariates. Furthermore, the oracle properties of the robust smooth-threshold estimating equations are established in the "large n, diverging p" setting under some regularity conditions. Extensive simulation studies and a yeast cell cycle data set are used to evaluate the performance of the proposed method, and the results show that it is competitive with existing robust variable selection procedures.