In partially linear additive models, the response variable is modelled through a linear component in a subset of the covariates and an additive component in which the remaining covariates enter the model as a sum of unknown univariate functions. This structure is more flexible than fully linear or fully nonparametric regression models, avoids the 'curse of dimensionality', is easily interpretable, and allows the user to include discrete or categorical variables in the linear part. In practice, however, users tend to include all available variables in the model, regardless of their actual impact on the response. For this reason variable selection plays an important role, since including covariates that have no effect on the response reduces the prediction capability of the model. As in other settings, outliers in the data may harm estimators based on strong assumptions, such as normality of the response variable, leading to conclusions that are not representative of the data set. In this work, we propose a family of robust estimators that simultaneously estimate the model and select variables from both the linear and the additive parts. This family applies an adaptive procedure to a general class of penalties in the regularization term of the objective function that defines the estimators. We study the behaviour of the proposal against its least-squares counterpart in a simulation study and illustrate the advantages of its use on a real data set.
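For concreteness, the following is a minimal LaTeX sketch of the model structure and of one plausible form of the penalized robust objective described above; all notation (beta, g_j, rho, the adaptive weights w_k and v_j, and the penalty J_lambda) is illustrative rather than taken from the paper, and the amsmath package is assumed.

% Partially linear additive model (notation illustrative): the response
% y_i has a linear part in x_i and an additive part in which each
% covariate t_{ij} enters through an unknown univariate function g_j.
\begin{equation*}
  y_i = \beta^{\top}\mathbf{x}_i + \sum_{j=1}^{q} g_j(t_{ij}) + \varepsilon_i,
  \qquad i = 1, \dots, n.
\end{equation*}
% One plausible form of the robust penalized objective: a bounded loss
% \rho replaces the squared error of the least-squares counterpart, and
% adaptive weights w_k, v_j scale a general penalty J_\lambda applied to
% both the linear coefficients and the additive components.
\begin{equation*}
  (\widehat{\beta}, \widehat{g}) \in \operatorname*{arg\,min}_{\beta,\, g}\;
  \frac{1}{n}\sum_{i=1}^{n}
  \rho\!\left(\frac{y_i - \beta^{\top}\mathbf{x}_i
      - \sum_{j=1}^{q} g_j(t_{ij})}{\widehat{\sigma}}\right)
  + \sum_{k=1}^{p} w_k\, J_{\lambda}\bigl(|\beta_k|\bigr)
  + \sum_{j=1}^{q} v_j\, J_{\lambda}\bigl(\lVert g_j \rVert\bigr).
\end{equation*}

Under this reading, setting all weights to one recovers a non-adaptive penalty, while replacing \rho with the squared error yields the least-squares counterpart used as the benchmark in the simulations.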