Local variable selection aims to discover localized effects by assessing the impact of covariates on outcomes within specific regions defined by other covariates. We outline some challenges of local variable selection in the presence of non-linear relationships and model misspecification. Specifically, we highlight a potential drawback of common semi-parametric methods: even slight model misspecification can result in a high rate of false positives. To address these shortcomings, we propose a methodology based on orthogonal cut splines that achieves consistent local variable selection in high-dimensional scenarios. Our approach offers simplicity, handles both continuous and discrete covariates, and provides theory for high-dimensional covariates and model misspecification. We discuss settings with either independent or dependent data. Our proposal allows including adjustment covariates that do not undergo selection, enhancing flexibility in modeling complex scenarios. We illustrate its application in simulation studies with both independent and functional data, as well as with two real datasets. One dataset evaluates salary gaps associated with discrimination factors at different ages, while the other examines the effects of covariates on brain activation over time. The approach is implemented in the R package mombf.
翻译:暂无翻译