Assume that we observe i.i.d.~points lying close to some unknown $d$-dimensional $\mathcal{C}^k$ submanifold $M$ in a possibly high-dimensional space. We study the problem of reconstructing the probability distribution generating the sample. After remarking that this problem is degenerate for a large class of standard losses ($L_p$, Hellinger, total variation, etc.), we focus on the Wasserstein loss, for which we build an estimator, based on kernel density estimation, whose rate of convergence depends on $d$ and the regularity $s\leq k-1$ of the underlying density, but not on the ambient dimension. In particular, we show that the estimator is minimax and matches previous rates in the literature in the case where the manifold $M$ is a $d$-dimensional cube. The related problem of the estimation of the volume measure of $M$ for the Wasserstein loss is also considered, for which a minimax estimator is exhibited.
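For reference, the Wasserstein loss mentioned above is usually understood as the standard optimal-transport distance. The symbols $p$, $\mu$, $\nu$ and $\Pi(\mu,\nu)$ are notation introduced here for illustration and do not appear in the text above: for $p \geq 1$ and two probability measures $\mu$, $\nu$ on the ambient space with finite $p$-th moments,
\[
  W_p(\mu,\nu) \;=\; \Bigl( \inf_{\pi \in \Pi(\mu,\nu)} \int |x-y|^p \,\mathrm{d}\pi(x,y) \Bigr)^{1/p},
\]
where $\Pi(\mu,\nu)$ denotes the set of couplings of $\mu$ and $\nu$, that is, probability measures on the product space with marginals $\mu$ and $\nu$. Unlike the total variation distance, which equals $1$ whenever two measures are mutually singular (for instance when they are supported on disjoint submanifolds), $W_p$ is small as soon as the two measures are close in the ambient metric.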