Manifold learning techniques for nonlinear dimension reduction assume that high-dimensional feature vectors lie on a low-dimensional manifold, and then attempt to exploit manifold structure to obtain useful low-dimensional Euclidean representations of the data. Isomap, a seminal manifold learning technique, is an elegant synthesis of two simple ideas: the approximation of Riemannian distances with shortest path distances on a graph that localizes manifold structure, and the approximation of shortest path distances with Euclidean distances by multidimensional scaling. We revisit the rationale for Isomap, clarifying what Isomap does and what it does not do. In particular, we explore the widespread perception that Isomap should only be used when the manifold is parametrized by a convex region of Euclidean space. We argue that this perception is based on an extremely narrow interpretation of manifold learning as parametrization recovery, and we submit that Isomap is better understood as constructing Euclidean representations of geodesic structure. We reconsider a well-known example that was previously interpreted as evidence of Isomap's limitations, and we re-examine the original analysis of Isomap's convergence properties, concluding that convexity is not required for shortest path distances to converge to Riemannian distances.
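The two ideas that Isomap combines can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `isomap_sketch` and its parameters are hypothetical, a k-nearest-neighbor graph is assumed as the graph that localizes manifold structure, Floyd-Warshall is chosen for brevity rather than speed, and classical (Torgerson) scaling is assumed as the multidimensional scaling step.

```python
import numpy as np

def isomap_sketch(X, n_neighbors=8, n_components=2):
    """Illustrative Isomap: kNN-graph shortest paths, then classical MDS.

    Hypothetical helper for exposition only. Returns the embedding Y and
    the matrix G of shortest path distances on the neighborhood graph.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances in the ambient space.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Idea 1a: a k-nearest-neighbor graph localizes the manifold structure.
    # Non-edges are given infinite length.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nbrs = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), n_neighbors)
    G[rows, nbrs.ravel()] = D[rows, nbrs.ravel()]
    G = np.minimum(G, G.T)  # make the graph undirected

    # Idea 1b: shortest path distances approximate Riemannian distances.
    # Floyd-Warshall, vectorized over all (i, j) pairs for each pivot k.
    for k in range(n):
        G = np.minimum(G, G[:, k, None] + G[None, k, :])
    if np.isinf(G).any():
        raise ValueError("neighborhood graph is disconnected; increase n_neighbors")

    # Idea 2: classical MDS approximates the shortest path distances
    # with Euclidean distances between low-dimensional points.
    J = np.eye(n) - 1.0 / n               # centering matrix I - (1/n) 1 1^T
    B = -0.5 * J @ (G ** 2) @ J           # double-centered squared distances
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:n_components]
    Y = V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
    return Y, G

# Toy data (an assumption for illustration): 200 points on three quarters
# of the unit circle, a one-dimensional manifold embedded in the plane.
theta = np.linspace(0.0, 1.5 * np.pi, 200)
X = np.column_stack([np.cos(theta), np.sin(theta)])
Y, G = isomap_sketch(X, n_neighbors=2, n_components=1)

# Compare the graph distance between the endpoints with their chord length.
print(G[0, -1], np.linalg.norm(X[0] - X[-1]))
```

In this toy case the shortest path distance between the two endpoints is close to the arc length 3π/2 ≈ 4.71, whereas their Euclidean distance in the plane is √2 ≈ 1.41. The one-dimensional output `Y` is then a Euclidean representation of those geodesic distances, which is the reading of Isomap advocated above.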