We study representations of data from an arbitrary metric space $\mathcal{X}$ in the space of univariate Gaussian mixtures with a transport metric (Delon and Desolneux 2020). We derive embedding guarantees for feature maps implemented by small neural networks called \emph{probabilistic transformers}. Our guarantees are of memorization type: we prove that a probabilistic transformer of depth about $n\log(n)$ and width about $n^2$ can bi-H\"{o}lder embed any $n$-point dataset from $\mathcal{X}$ with low metric distortion, thus avoiding the curse of dimensionality. We further derive probabilistic bi-Lipschitz guarantees which trade off the amount of distortion and the probability that a randomly chosen pair of points embeds with that distortion. If $\mathcal{X}$'s geometry is sufficiently regular, we obtain stronger, bi-Lipschitz guarantees for all points in the dataset. As applications we derive neural embedding guarantees for datasets from Riemannian manifolds, metric trees, and certain types of combinatorial graphs.
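The target space of the embeddings, univariate Gaussian mixtures under the transport metric of Delon and Desolneux (2020), admits a concrete computation: the ground cost between two univariate Gaussians has the closed form $W_2^2(\mathcal{N}(m_1,s_1),\mathcal{N}(m_2,s_2)) = (m_1-m_2)^2 + (s_1-s_2)^2$, and the mixture distance solves a discrete optimal transport problem over components. The following is a minimal sketch, not the paper's implementation; the function name `mw2` and the use of `scipy.optimize.linprog` as the OT solver are illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

def mw2(means1, stds1, w1, means2, stds2, w2):
    """Sketch of the mixture-Wasserstein distance (Delon & Desolneux 2020)
    between two univariate Gaussian mixtures, given as arrays of component
    means, standard deviations, and weights (weights summing to 1)."""
    k, l = len(w1), len(w2)
    # Ground cost: squared 2-Wasserstein distance between univariate
    # Gaussians, which is (m1 - m2)^2 + (s1 - s2)^2 in closed form.
    cost = (np.subtract.outer(means1, means2) ** 2
            + np.subtract.outer(stds1, stds2) ** 2)
    # Marginal constraints of the discrete OT problem:
    # each row of the coupling sums to w1[i], each column to w2[j].
    A_eq = []
    for i in range(k):
        row = np.zeros((k, l)); row[i, :] = 1.0; A_eq.append(row.ravel())
    for j in range(l):
        col = np.zeros((k, l)); col[:, j] = 1.0; A_eq.append(col.ravel())
    b_eq = np.concatenate([w1, w2])
    res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return float(np.sqrt(res.fun))
```

For two single-component mixtures the problem is trivial and the distance reduces to the closed-form Gaussian $W_2$ distance; for genuine mixtures the linear program selects the optimal matching of components.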