Dimension reduction (DR) aims to learn low-dimensional representations of high-dimensional data while preserving essential information. In the context of manifold learning, we formally define information-lossless DR as producing representations that preserve the topological and geometric properties of the data manifold, and propose a novel two-stage DR method, called invertible manifold learning (inv-ML), to bridge the gap between theoretically information-lossless and practical DR. The first stage includes a homeomorphic sparse coordinate transformation that learns low-dimensional representations without destroying topology, and a local isometry constraint that preserves local geometry. In the second stage, a linear compression is applied to trade off the target dimension against the information loss incurred in excessive DR scenarios. Experiments are conducted on seven datasets with a neural network implementation of inv-ML, called i-ML-Enc. Empirically, i-ML-Enc achieves invertible DR, in contrast to typical existing methods, and reveals characteristics of the learned manifolds. Through latent space interpolation on real-world datasets, we find that the reliability of the tangent space approximated by the local neighborhood is the key to the success of manifold-based DR algorithms.
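To make the local isometry constraint of the first stage concrete, the following is a minimal sketch of how such a constraint could be expressed as a training loss: it penalizes distortion of pairwise Euclidean distances within each point's k-nearest neighborhood between the input space and the latent space. This is an illustrative assumption in PyTorch, not the paper's actual i-ML-Enc implementation; the class name `LocalIsometryLoss` and the parameter `k` are hypothetical.

```python
import torch
import torch.nn as nn


class LocalIsometryLoss(nn.Module):
    """Hypothetical sketch of a local isometry penalty: match pairwise
    distances within each point's k-NN neighborhood (computed in the
    input space) between inputs x and latent codes z."""

    def __init__(self, k: int = 10):
        super().__init__()
        self.k = k

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # x: (n, D) high-dimensional inputs; z: (n, d) low-dimensional codes
        dx = torch.cdist(x, x)  # (n, n) pairwise distances in input space
        dz = torch.cdist(z, z)  # (n, n) pairwise distances in latent space
        # Indices of the k nearest neighbors of each point in input space;
        # the closest point is the point itself (distance 0), so drop it.
        knn_idx = dx.topk(self.k + 1, largest=False).indices[:, 1:]  # (n, k)
        rows = torch.arange(x.size(0)).unsqueeze(1).expand_as(knn_idx)
        # Squared distance distortion, averaged over all neighbor pairs.
        return ((dx[rows, knn_idx] - dz[rows, knn_idx]) ** 2).mean()
```

In practice, such a term would be added to the encoder's training objective with a weighting coefficient, so that the learned coordinate transformation stays close to an isometry on each local neighborhood while the homeomorphic mapping handles the global topology.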