Missing data are frequently encountered in practice and often significantly hamper standard time series analysis. In this paper, we introduce temporal Wasserstein imputation, a novel method for imputing missing data in time series. Unlike existing techniques, our approach is fully nonparametric, circumventing the need for model specification prior to imputation and making it suitable for potentially nonlinear dynamics. Its principled algorithmic implementation can seamlessly handle univariate or multivariate time series with any pattern of missingness. In addition, side information about the missing entries, such as their plausible range (e.g., box constraints), can easily be incorporated. As a key advantage, our method mitigates the distributional bias typical of many existing approaches, ensuring more reliable downstream statistical analysis using the imputed series. Leveraging the benign landscape of the optimization formulation, we establish the convergence of an alternating minimization algorithm to critical points. Furthermore, we provide conditions under which the marginal distributions of the underlying time series can be identified. Our numerical experiments, including extensive simulations covering linear and nonlinear time series models and an application to a real-world groundwater dataset laden with missing data, corroborate the practical usefulness of the proposed method.