The analysis of samples of random objects that do not lie in a vector space is gaining increasing attention in statistics. An important class of such object data is univariate probability measures defined on the real line. Adopting the Wasserstein metric, we develop a class of regression models for such data, where random distributions serve as predictors and the responses are either also distributions or scalars. To define this regression model, we utilize the geometry of tangent bundles of the space of random measures endowed with the Wasserstein metric for mapping distributions to tangent spaces. The proposed distribution-to-distribution regression model provides an extension of multivariate linear regression for Euclidean data and function-to-function regression for Hilbert space valued data in functional data analysis. In simulations, it performs better than an alternative transformation approach where one maps distributions to a Hilbert space through the log quantile density transformation and then applies traditional functional regression. We derive asymptotic rates of convergence for the estimator of the regression operator and for predicted distributions and also study an extension to autoregressive models for distribution-valued time series. The proposed methods are illustrated with data on human mortality and distributional time series of house prices.
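As a rough illustration of the tangent-space mapping mentioned above (a minimal sketch, not the authors' implementation), the following relies on two standard facts for univariate distributions: the 2-Wasserstein distance equals the L2 distance between quantile functions, and the Wasserstein Fréchet mean has the pointwise average quantile function. Expressed in quantile coordinates, the log map at a reference distribution is then simply a difference of quantile functions. All function names, the grid size, and the toy data are assumptions made for illustration.

```python
import numpy as np

# Midpoint probability grid on (0, 1); the grid size 200 is an arbitrary choice.
GRID = (np.arange(200) + 0.5) / 200


def quantile_function(sample, grid=GRID):
    """Empirical quantile function of a 1-D sample, evaluated on the grid."""
    return np.quantile(np.asarray(sample, dtype=float), grid)


def wasserstein2(q1, q2):
    """2-Wasserstein distance: L2 distance between quantile functions
    (the integral over (0, 1) is approximated by a midpoint-rule mean)."""
    return float(np.sqrt(np.mean((q1 - q2) ** 2)))


def frechet_mean(quantiles):
    """Wasserstein Frechet mean of univariate distributions:
    the pointwise average of their quantile functions."""
    return np.mean(np.stack(quantiles), axis=0)


def log_map(q_ref, q):
    """Tangent vector at the reference distribution, in quantile coordinates."""
    return q - q_ref


def exp_map(q_ref, v):
    """Inverse of log_map; the result is a valid quantile function
    only if q_ref + v is nondecreasing."""
    return q_ref + v


# Toy data (hypothetical): the second sample is the first shifted by 3,
# so the distance between the two distributions should be about 3.
a = np.arange(100.0)
b = a + 3.0
q_a, q_b = quantile_function(a), quantile_function(b)

q_mean = frechet_mean([q_a, q_b])
v_a, v_b = log_map(q_mean, q_a), log_map(q_mean, q_b)
dist_ab = wasserstein2(q_a, q_b)
print(dist_ab)
```

Once predictor and response distributions are mapped to tangent vectors such as `v_a` and `v_b`, a linear regression operator can be fitted between the two tangent spaces and predictions mapped back with `exp_map`; that fitting step is not shown here.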