The model-X knockoffs framework provides a flexible tool for achieving finite-sample false discovery rate (FDR) control in variable selection in arbitrary dimensions without assuming any dependence structure of the response on covariates. It also completely bypasses the use of conventional p-values, making it especially appealing in high-dimensional nonlinear models. Existing works have focused on the setting of independent and identically distributed observations. Yet time series data is prevalent in practical applications in various fields such as economics and social sciences. This motivates the study of model-X knockoffs inference for time series data. In this paper, we make some initial attempt to establish the theoretical and methodological foundation for the model-X knockoffs inference for time series data. We suggest the method of time series knockoffs inference (TSKI) by exploiting the ideas of subsampling and e-values to address the difficulty caused by the serial dependence. We also generalize the robust knockoffs inference to the time series setting and relax the assumption of known covariate distribution required by model-X knockoffs, because such an assumption is overly stringent for time series data. We establish sufficient conditions under which TSKI achieves the asymptotic FDR control. Our technical analysis reveals the effects of serial dependence and unknown covariate distribution on the FDR control. We conduct power analysis of TSKI using the Lasso coefficient difference knockoff statistic under linear time series models. The finite-sample performance of TSKI is illustrated with several simulation examples and an economic inflation study.
翻译:暂无翻译