Genetic Programming for Manifold Learning: Preserving Local Topology

Manifold learning methods are an invaluable tool in today's world of increasingly huge datasets. Manifold learning algorithms can discover a much lower-dimensional representation (embedding) of a high-dimensional dataset through non-linear transformations that preserve the most important structure of the original data. State-of-the-art manifold learning methods directly optimise an embedding without mapping between the original space and the discovered embedded space. This makes interpretability - a key requirement in exploratory data analysis - nearly impossible. Recently, genetic programming has emerged as a very promising approach to manifold learning by evolving functional mappings from the original space to an embedding. However, genetic programming-based manifold learning has struggled to match the performance of other approaches. In this work, we propose a new approach to using genetic programming for manifold learning, which preserves local topology. This is expected to significantly improve performance on tasks where local neighbourhood structure (topology) is paramount. We compare our proposed approach with various baseline manifold learning methods and find that it often outperforms other methods, including a clear improvement over previous genetic programming approaches. These results are particularly promising, given the potential interpretability and reusability of the evolved mappings.
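The abstract does not state how local topology is measured, so the following is only a minimal sketch of one common way to quantify it: the fraction of each point's k nearest neighbours in the original space that remain among its k nearest neighbours in the embedding. The function names, the choice of k, and `example_mapping` (a hand-written stand-in for an evolved mapping) are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def knn_indices(points: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k nearest neighbours of every row (excluding itself)."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    return np.argsort(dists, axis=1)[:, :k]


def neighbourhood_preservation(X: np.ndarray, Z: np.ndarray, k: int = 5) -> float:
    """Mean fraction of each point's k nearest neighbours in X that are kept in Z."""
    nn_x = knn_indices(X, k)
    nn_z = knn_indices(Z, k)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_x, nn_z)]
    return float(np.mean(overlap))


def example_mapping(X: np.ndarray) -> np.ndarray:
    """Hypothetical explicit mapping; a GP system would evolve expressions like these."""
    return np.column_stack([X[:, 0] + X[:, 1], X[:, 2] * X[:, 3]])


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))   # synthetic 10-dimensional data (assumed)
Z = example_mapping(X)           # 2-dimensional embedding from the explicit mapping
score = neighbourhood_preservation(X, Z, k=10)
print(f"neighbourhood preservation: {score:.3f}")
```

Because the mapping is an explicit function of the input features, it can be read directly and reapplied to unseen points, which is the interpretability and reusability benefit the abstract refers to. A score of this kind could in principle serve as a fitness signal, but that is a design assumption here.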

Manifold learning

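Manifold learning, in full "manifold learning methods", has been an active research topic in information science since it was first proposed in the journal Science in 2000. It is significant in both theory and application. Assuming the data are sampled uniformly from a low-dimensional manifold lying in a high-dimensional Euclidean space, manifold learning recovers the structure of that low-dimensional manifold from the high-dimensional samples: it locates the manifold in the high-dimensional space and computes the corresponding embedding map, which enables dimensionality reduction or data visualisation. In this sense it looks past the observed phenomena to the underlying regularities that generated the data.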
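As a toy illustration of the idea above, the sketch below samples points from a one-dimensional manifold (a half-circle arc) lying in two-dimensional space and applies an explicit embedding map that recovers the single underlying coordinate. The arc, the parameter range, and the function name `embedding_map` are assumptions chosen for illustration; real manifold learning has to discover such a map from the data rather than being given it.

```python
import numpy as np

# Hidden 1-D coordinate that generates the data (assumed range inside (0, pi)).
t = np.linspace(0.1, 3.0, 50)

# Observed samples: points on a unit-circle arc in 2-D space.
X = np.column_stack([np.cos(t), np.sin(t)])


def embedding_map(points: np.ndarray) -> np.ndarray:
    """Map each 2-D point on the arc to its 1-D angle coordinate."""
    return np.arctan2(points[:, 1], points[:, 0])


Z = embedding_map(X)  # 1-D embedding of the 2-D samples

# The embedding recovers the generating coordinate up to floating-point error.
print(np.allclose(Z, t))
```

Here two observed features collapse to one coordinate without losing any information about where each point sits on the arc, which is the dimensionality reduction the paragraph describes.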
