Assume that we observe i.i.d. points lying close to some unknown $d$-dimensional $\mathcal{C}^k$ submanifold $M$ in a possibly high-dimensional space. We study the problem of reconstructing the probability distribution generating the sample. After remarking that this problem is degenerate for a large class of standard losses ($L_p$, Hellinger, total variation, etc.), we focus on the Wasserstein loss, for which we build an estimator, based on kernel density estimation, whose rate of convergence depends on $d$ and the regularity $s\leq k-1$ of the underlying density, but not on the ambient dimension. In particular, we show that the estimator is minimax and matches previous rates in the literature in the case where the manifold $M$ is a $d$-dimensional cube. The related problem of the estimation of the volume measure of $M$ for the Wasserstein loss is also considered, for which a minimax estimator is exhibited.