In the framework of scalar-on-function regression models, in which several functional predictors are used to explain a scalar response, we propose a methodology that selects the relevant functional predictors while simultaneously providing accurate smooth (or, more generally, regular) estimates of the functional coefficients. We assume that the functional predictors belong to a real separable Hilbert space and that the functional coefficients belong to a specific subspace of it. Such a subspace can be a Reproducing Kernel Hilbert Space (RKHS), which guarantees desired regularity properties of the coefficient estimates, such as smoothness or periodicity. Our procedure, called SOFIA (Scalar-On-Function Integrated Adaptive Lasso), is based on an adaptive penalized least squares algorithm that exploits functional subgradients to solve the minimization problem efficiently. We show that the proposed method satisfies the functional oracle property even when the number of predictors exceeds the sample size. SOFIA's performance in variable selection and coefficient estimation is assessed through extensive simulation studies and a real-data application to GDP growth prediction.
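To fix ideas, the display below sketches one standard formulation of such a model with an adaptive-lasso-type penalty. The notation ($y_i$, $x_{ij}$, $\beta_j$, the weights $w_j$, and the tuning parameter $\lambda$) is illustrative and not taken from the paper; the exact criterion minimized by SOFIA may differ.
\[
y_i = \alpha + \sum_{j=1}^{p} \langle x_{ij}, \beta_j \rangle_{H} + \varepsilon_i, \qquad i = 1, \dots, n,
\]
where each predictor $x_{ij}$ lies in a real separable Hilbert space $H$ and each coefficient $\beta_j$ in a subspace of $H$ (e.g., an RKHS). A typical adaptive penalized least squares estimator then solves
\[
\min_{\alpha,\, \beta_1, \dots, \beta_p} \; \frac{1}{2n} \sum_{i=1}^{n} \Bigl( y_i - \alpha - \sum_{j=1}^{p} \langle x_{ij}, \beta_j \rangle_{H} \Bigr)^{2} \;+\; \lambda \sum_{j=1}^{p} w_j \, \lVert \beta_j \rVert,
\]
with data-driven weights such as $w_j = \lVert \tilde{\beta}_j \rVert^{-\gamma}$ for a pilot estimate $\tilde{\beta}_j$. Because the penalty acts on the norm of each coefficient function as a whole, it can shrink entire $\beta_j$ to zero, which is what performs variable selection; the non-differentiability of the norm at zero is where functional subgradients enter the optimization.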