Linear regression models have been extensively studied in the literature. However, in some practical applications they may not be appropriate over the whole range of the covariate. In this paper, a more flexible model is introduced by considering a regression model $Y=r(X)+\varepsilon$ where the regression function $r(\cdot)$ is assumed to be linear for large values of the predictor variable $X$. More precisely, we assume that $r(x)=\alpha_0+\beta_0 x$ for $x> u_0$, where $u_0$ is the smallest value for which this property holds. A penalized procedure is introduced to estimate the threshold $u_0$. The proposal follows a semiparametric approach, since no parametric model is assumed for the regression function for values smaller than $u_0$. Consistency of both the threshold estimator and the estimators of $(\alpha_0,\beta_0)$ is derived under mild assumptions. A numerical study investigates the small-sample properties of the proposed procedure and the importance of introducing a penalization. The analysis of a real data set demonstrates the usefulness of the penalized estimators.
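To make the setting concrete, the following is a minimal sketch of how a threshold of this kind might be estimated on simulated data. It is not the procedure of the paper: the abstract does not specify the penalized criterion, so the criterion used here (mean squared residual of a least-squares line fitted to the observations with $x>u$, plus a penalty $\lambda u$ that favours small thresholds) is an assumption made purely for illustration. The names `simulate`, `fit_tail`, `estimate_threshold`, the sinusoidal shape of $r$ below $u_0$, and all numerical values ($u_0=0.5$, $\alpha_0=1$, $\beta_0=2$, $\lambda=0.05$) are likewise hypothetical.

```python
import numpy as np


def simulate(n, u0=0.5, alpha0=1.0, beta0=2.0, sigma=0.1, seed=0):
    """Draw (x, y) from Y = r(X) + eps with r linear only for x > u0.

    Below u0 a sine term is added to the line; this shape is an
    illustrative assumption and is continuous at the default u0 = 0.5.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    line = alpha0 + beta0 * x
    r = np.where(x > u0, line, line + np.sin(6.0 * np.pi * x))
    y = r + rng.normal(0.0, sigma, n)
    return x, y


def fit_tail(x, y, u):
    """Least-squares line on the observations with x > u.

    Returns the coefficients (intercept, slope) and the mean squared residual.
    """
    mask = x > u
    design = np.column_stack([np.ones(mask.sum()), x[mask]])
    coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
    resid = y[mask] - design @ coef
    return coef, float(np.mean(resid ** 2))


def estimate_threshold(x, y, grid, lam, min_points=20):
    """Minimise the assumed criterion MSE(u) + lam * u over a grid of u.

    Without the penalty, every u >= u0 gives roughly the same MSE, so the
    penalty term is what selects the smallest such threshold.
    """
    best = None
    for u in grid:
        if np.sum(x > u) < min_points:
            continue  # too few tail points for a stable fit
        coef, mse = fit_tail(x, y, u)
        crit = mse + lam * u
        if best is None or crit < best[0]:
            best = (crit, float(u), coef)
    return best[1], best[2]


x, y = simulate(500)
u_hat, coef = estimate_threshold(x, y, np.linspace(0.0, 0.9, 91), lam=0.05)
print("estimated threshold:", u_hat)
print("estimated (alpha, beta):", coef)
```

In this sketch the penalty weight `lam` trades off the two kinds of error: a small value makes the criterion nearly flat above the true threshold, so the estimate may drift upward, while a large value pushes the estimate below $u_0$ and contaminates the linear fit with nonlinear observations.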