Despite the popularity of the manifold hypothesis, current manifold-learning methods do not support machine learning directly on the latent $d$-dimensional data manifold, as they primarily aim to perform dimensionality reduction into $\mathbb{R}^D$, losing key manifold features when the embedding dimension $D$ approaches $d$. On the other hand, methods that directly learn the latent manifold as a differentiable atlas have been relatively underexplored. In this paper, we aim to give a proof of concept of the effectiveness and potential of atlas-based methods. To this end, we implement a generic data structure to maintain a differentiable atlas that enables Riemannian optimization over the manifold. We complement this with an unsupervised heuristic that learns a differentiable atlas from point cloud data. We experimentally demonstrate that this approach has advantages in terms of efficiency and accuracy in selected settings. Moreover, in a supervised classification task over the Klein bottle and in RNA velocity analysis of hematopoietic data, we showcase the improved interpretability and robustness of our approach.
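To make the idea of optimizing directly on an atlas concrete, the following is a minimal sketch and not the implementation described in this paper. It assumes a toy manifold, the unit circle $S^1 \subset \mathbb{R}^2$, covered by two stereographic charts; the names `Chart`, `CHARTS`, `transition`, `riemannian_step`, and `minimize` are hypothetical. A point is stored as a pair (chart index, local coordinate), the metric is pulled back through the inverse chart map by finite differences, and a gradient step is taken in coordinates, switching charts through the transition map whenever the coordinate leaves the interval $[-1, 1]$.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple

Point = Tuple[float, float]

@dataclass
class Chart:
    to_coord: Callable[[Point], float]  # phi: open subset of S^1 -> R
    to_point: Callable[[float], Point]  # psi = phi^{-1}: R -> S^1

def _inv_north(u: float) -> Point:
    # Inverse stereographic projection from the north pole (0, 1).
    s = u * u + 1.0
    return (2.0 * u / s, (u * u - 1.0) / s)

def _inv_south(v: float) -> Point:
    # Inverse stereographic projection from the south pole (0, -1).
    s = v * v + 1.0
    return (2.0 * v / s, (1.0 - v * v) / s)

# Toy atlas (an assumption for illustration): chart 0 omits the north pole,
# chart 1 omits the south pole; together they cover the circle.
CHARTS = (
    Chart(lambda p: p[0] / (1.0 - p[1]), _inv_north),
    Chart(lambda p: p[0] / (1.0 + p[1]), _inv_south),
)

def transition(src: int, dst: int, u: float) -> float:
    # Transition map phi_dst o psi_src; for these two charts it equals 1/u.
    return CHARTS[dst].to_coord(CHARTS[src].to_point(u))

def derivative(fn: Callable[[float], float], u: float, h: float = 1e-6) -> float:
    # Central finite difference, standing in for automatic differentiation.
    return (fn(u + h) - fn(u - h)) / (2.0 * h)

def riemannian_step(f, chart: int, u: float, lr: float):
    psi = CHARTS[chart].to_point
    # Pulled-back metric g(u) = |d psi / du|^2 (a scalar in one dimension).
    dx = derivative(lambda t: psi(t)[0], u)
    dy = derivative(lambda t: psi(t)[1], u)
    g = dx * dx + dy * dy
    # Riemannian gradient in coordinates: g^{-1} * d(f o psi)/du.
    df = derivative(lambda t: f(psi(t)), u)
    u = u - lr * df / g
    # Keep the coordinate well-conditioned by moving to the other chart.
    if abs(u) > 1.0:
        other = 1 - chart
        u = transition(chart, other, u)
        chart = other
    return chart, u

def minimize(f, chart: int, u: float, lr: float = 0.1, steps: int = 100):
    for _ in range(steps):
        chart, u = riemannian_step(f, chart, u, lr)
    return chart, u, CHARTS[chart].to_point(u)

# Example objective: squared Euclidean distance to the north pole, which
# chart 0 does not cover, so the iterate has to change charts on the way.
target = (0.0, 1.0)

def sq_dist(p: Point) -> float:
    return (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2

chart, u, point = minimize(sq_dist, chart=0, u=0.5)
```

In this sketch the iterate starts in chart 0 at the point $(0.8, -0.6)$, crosses into chart 1 once its coordinate exceeds 1 in absolute value, and then contracts toward the coordinate of the north pole. The coordinate-wise update acts as a simple retraction; a learned atlas would replace the closed-form chart maps with differentiable models fitted to the point cloud.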