We study the problem of variable selection in convex nonparametric least squares (CNLS). Whereas the least absolute shrinkage and selection operator (Lasso) is a popular technique for least squares, its variable selection performance is unknown in CNLS problems. In this work, we investigate the performance of the Lasso CNLS estimator and find out it is usually unable to select variables efficiently. Exploiting the unique structure of the subgradients in CNLS, we develop a structured Lasso by combining $\ell_1$-norm and $\ell_{\infty}$-norm. To improve its predictive performance, we propose a relaxed version of the structured Lasso where we can control the two effects--variable selection and model shrinkage--using an additional tuning parameter. A Monte Carlo study is implemented to verify the finite sample performances of the proposed approaches. In the application of Swedish electricity distribution networks, when the regression model is assumed to be semi-nonparametric, our methods are extended to the doubly penalized CNLS estimators. The results from the simulation and application confirm that the proposed structured Lasso performs favorably, generally leading to sparser and more accurate predictive models, relative to the other variable selection methods in the literature.
翻译:暂无翻译