As an effective nonparametric method, empirical likelihood (EL) is appealing because it combines estimating equations flexibly and adaptively to incorporate the information in the data. To select important variables and estimating equations in a sparse high-dimensional model, we consider a penalized EL method based on robust estimating functions. The method applies two penalty functions that regularize the regression parameters and the associated Lagrange multipliers simultaneously, which allows the dimensionalities of both the regression parameters and the estimating equations to grow exponentially with the sample size. This paper gives a first examination of how the robustness of the estimating equations contributes to estimating-equation selection and variable selection, from both a theoretical perspective and through simulation results. The proposed method improves robustness and effectiveness when the data contain outliers or have heavy tails in the response variables and/or covariates. The robustness of the estimator is measured via its bounded influence function, and the oracle properties are established under some regularity conditions. Extensive simulation studies and a yeast cell data set are used to evaluate the performance of the proposed method. The numerical results indicate that robust selection of sparse estimating equations substantially improves variable selection accuracy when the data have heavy tails and/or contain outliers.
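To make the notion of a bounded-influence estimating function concrete, the following is a minimal sketch that assumes a linear model and a Huber-type score. The abstract does not specify the form of the robust estimating functions, so the names `huber_psi` and `robust_estimating_function`, the tuning constant `c = 1.345`, and the form g_i = psi(y_i - x_i'beta) x_i are illustrative assumptions rather than the construction used in the paper.

```python
import numpy as np


def huber_psi(r, c=1.345):
    """Huber score: identity for |r| <= c, clipped to +/-c beyond it."""
    return np.clip(r, -c, c)


def robust_estimating_function(x, y, beta, c=1.345):
    """Per-observation estimating function g_i = psi(y_i - x_i'beta) * x_i.

    Hypothetical form chosen for illustration; one row per observation.
    """
    residuals = y - x @ beta
    return huber_psi(residuals, c)[:, None] * x


# Simulated data (assumption: Gaussian covariates and noise).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))
beta = np.array([1.0, -0.5])
y = x @ beta + rng.normal(size=5)

# Contaminate the first response with a gross outlier.
y_outlier = y.copy()
y_outlier[0] += 1e6

g_clean = robust_estimating_function(x, y, beta)
g_dirty = robust_estimating_function(x, y_outlier, beta)

# The contaminated row stays bounded by c * |x_0| despite the huge residual.
print(np.abs(g_dirty[0]), 1.345 * np.abs(x[0]))
```

In this sketch, clipping the residual bounds the contribution of an outlying response only. Outliers in the covariates would need additional down-weighting of `x`, which is not shown here.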