Autoencoders are widely used in machine learning to transform high-dimensional data into a lower-dimensional representation that still exhibits the essential characteristics of the input. The encoder provides an embedding of the input data manifold into a latent space, which may then be used for further processing. For instance, learning interpolation on the manifold may be simplified via the new manifold representation in latent space. The efficiency of such further processing depends heavily on the regularity and structure of the embedding. In this article, the embedding into latent space is regularized via a loss function that promotes an embedding that is as isometric and as flat as possible. The required training data comprises pairs of nearby points on the input manifold together with their local distance and their local Fréchet average. This regularity loss functional even allows the encoder to be trained on its own. The loss functional is computed via Monte Carlo integration, which is shown to be consistent with a geometric loss functional defined directly on the embedding map. Numerical tests are performed using image data that encodes different data manifolds. The results show that smooth manifold embeddings in latent space are obtained. These embeddings are regular enough that interpolation between not too distant points on the manifold is well approximated by linear interpolation in latent space.
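To make the structure of such a regularity loss concrete, the following is a minimal sketch, not the functional used in the article. It assumes each training sample is a tuple of two nearby points, their local distance, and their local average, and it averages a simple isometry penalty and a simple flatness penalty over the samples as a Monte Carlo estimate. The specific penalty forms, the weight `flat_weight`, and all function names (`regularity_loss`, `line_samples`, `isometric_encoder`, `stretched_encoder`) are illustrative assumptions.

```python
import math
import random


def regularity_loss(encoder, samples, flat_weight=1.0):
    """Monte Carlo estimate of an isometry-plus-flatness penalty.

    Each sample is (x, y, dist, avg): two nearby points on the input
    manifold, their local distance, and their local average.
    The penalty forms below are assumptions chosen for illustration.
    """
    total = 0.0
    count = 0
    for x, y, dist, avg in samples:
        zx, zy, zavg = encoder(x), encoder(y), encoder(avg)
        # Isometry term: the latent distance should match the manifold distance.
        iso = (math.dist(zx, zy) / dist - 1.0) ** 2
        # Flatness term: the encoded average should be the latent midpoint.
        mid = [(a + b) / 2.0 for a, b in zip(zx, zy)]
        flat = math.dist(zavg, mid) ** 2 / dist ** 2
        total += iso + flat_weight * flat
        count += 1
    return total / count


# Hypothetical toy manifold: a straight line in R^3 with unit direction U.
U = (0.6, 0.8, 0.0)


def line_samples(n, seed=0):
    """Draw pairs of nearby points on the line with exact distance and average."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        t = s + rng.uniform(0.05, 0.2)  # a nearby second parameter value
        x = tuple(s * c for c in U)
        y = tuple(t * c for c in U)
        avg = tuple(0.5 * (s + t) * c for c in U)
        samples.append((x, y, t - s, avg))
    return samples


def isometric_encoder(x):
    """Project onto the line direction: isometric and flat on this manifold."""
    return [sum(a * b for a, b in zip(x, U))]


def stretched_encoder(x):
    """Same projection scaled by 2: still flat, but no longer isometric."""
    return [2.0 * sum(a * b for a, b in zip(x, U))]


if __name__ == "__main__":
    data = line_samples(200)
    print("isometric encoder loss:", regularity_loss(isometric_encoder, data))
    print("stretched encoder loss:", regularity_loss(stretched_encoder, data))
```

On this toy line, the projection encoder incurs essentially no penalty, while the stretched encoder is penalized only through the isometry term. Because the loss depends on the encoder alone, a sketch like this can be evaluated without a decoder.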