We develop statistical models for samples of distribution-valued stochastic processes whose values are univariate distributions, using time-varying optimal transport process representations under the Wasserstein metric. While functional data analysis provides a toolbox for the analysis of samples of real- or vector-valued processes, there is currently no coherent statistical methodology for samples of distribution-valued processes, which are increasingly encountered in data analysis. To address this need, we introduce a transport model for samples of distribution-valued stochastic processes that takes an intrinsic approach in which distributions are represented by optimal transports. Substituting transports for distributions addresses the challenge of centering distribution-valued processes and leads to a useful and interpretable representation of each realized process by an overall transport and a real-valued trajectory, utilizing a scalar multiplication operation for transports. This representation facilitates a connection to Gaussian processes, which is especially useful when the distribution-valued processes are observed only on a sparse grid of time points. We study the convergence of the key components of the proposed representation to their population targets and demonstrate the practical utility of the proposed approach through simulations and application examples.
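To make the idea of representing distributions by transports concrete: for univariate distributions, the optimal transport map under the 2-Wasserstein metric from an atomless source to a target is the target quantile function composed with the source distribution function. The following is a minimal sketch under stated assumptions, not the method of the paper. The function names `transport_map` and `scale_transport` are hypothetical, the map is estimated from samples via empirical distribution and quantile functions, and the scaling rule x + a(T(x) - x) for 0 ≤ a ≤ 1 is only an assumed form of the scalar multiplication, which the abstract does not spell out.

```python
import numpy as np


def transport_map(source_sample, target_sample, grid):
    """Empirical 1D optimal transport map T = Q_target(F_source(x)) on `grid`.

    Hypothetical helper: uses the empirical CDF of the source sample and
    the empirical quantile function of the target sample.
    """
    source_sorted = np.sort(np.asarray(source_sample, dtype=float))
    # Empirical CDF of the source evaluated at each grid point.
    u = np.searchsorted(source_sorted, grid, side="right") / source_sorted.size
    # Empirical quantile function of the target evaluated at those levels.
    return np.quantile(target_sample, np.clip(u, 0.0, 1.0))


def scale_transport(T_values, grid, a):
    """Assumed scalar multiplication for 0 <= a <= 1: x + a * (T(x) - x).

    a = 0 gives the identity map and a = 1 gives T itself; values of `a`
    outside [0, 1] are deliberately not handled in this sketch.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("this sketch only covers 0 <= a <= 1")
    grid = np.asarray(grid, dtype=float)
    return grid + a * (np.asarray(T_values, dtype=float) - grid)


# Illustrative use with simulated data (not from the paper).
rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, scale=1.0, size=5000)
target = rng.normal(loc=2.0, scale=1.0, size=5000)
x = np.linspace(-1.5, 1.5, 7)

T = transport_map(source, target, x)        # roughly x + 2 for this pair
T_half = scale_transport(T, x, 0.5)         # halfway between identity and T
```

In this sketch a real-valued coefficient moves a point along the path between the identity map and a fixed transport, which is one way to read the abstract's pairing of an overall transport with a real-valued trajectory.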