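Project title: Research on Persistent Homology Methods in Data Analysis
Project number: No.61471409
Project type: General Program
Year of approval: 2015
Discipline: Radio electronics and telecommunications technology
Project author: Xia Shengxiang (夏省祥)
Affiliation: Shandong Jianzhu University (山东建筑大学)
Funding: CNY 550,000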
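Chinese abstract (translated): Modern science and engineering are producing noisy, high-dimensional, massive data sets at an unprecedented rate, and traditional mathematics is of almost no use in analyzing such data. This project will study the recently emerged theory and algorithms of persistent homology in data analysis, in particular the theory and algorithms for reconstructing high-dimensional systems from small samples. Taking image data as an example, it studies the local properties of image data spaces. First, high-contrast patches are extracted from images to form the sample data space S, and different core subsets of S are obtained using nearest neighbors. Second, based on the core subsets of S, a lazy witness complex is used to build a reasonable combinatorial structure K (a high-dimensional topological space), which gives an approximate representation of the underlying space X of S. Using persistent homology and programs written with the JavaPlex software, the barcodes of K are computed. These barcodes are used to remove noise from the sample data and to obtain the properties of K, and at the same time the geometric structure of the core subsets of S is obtained. In particular, the largest subspace of S with the same topology as the Klein bottle is found, and from it the global properties of the underlying space X are recovered. Completing the project will effectively solve some frontier problems in image compression, improve image compression techniques, and make persistent homology a powerful tool for analyzing noisy, high-dimensional, inhomogeneous sample data.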
Chinese keywords (translated): persistent homology; lazy witness complex; image compression; data analysis; combinatorial structure
English abstract: Modern science and engineering are producing noisy, high-dimensional, massive data at an unprecedented rate, and traditional mathematics is almost useless for analyzing such data. This project systematically studies the theory and algorithms of persistent homology, a recent development in data analysis, and in particular the theory and algorithms of high-dimensional system reconstruction based on small samples. Taking image data as an example, we study the local features of image data spaces. First, the sampled data S is obtained by extracting and processing high-contrast patches from images, and several core subsets of S are found using nearest neighbors. Second, a reasonable combinatorial structure K (a higher-dimensional topological space) is constructed from the core subsets of S using a lazy witness complex, which yields an approximate representation of the underlying space X of the sampled data S. Using persistent homology, the barcodes of K are computed with the JavaPlex software and a custom program. The barcodes can be used to remove noise from the sampled data and thereby obtain the features of K. At the same time, persistent homology yields the geometric structure of the core subsets of S. In particular, the largest subspace of S that has the homology of the Klein bottle is found, and from it the global features of the underlying space X are recovered. On completion, the project is expected to solve some frontier problems in image compression, improve image compression technology, and make persistent homology a powerful analysis tool for noisy, high-dimensional, inhomogeneous sampled data.
English keywords: persistent homology; lazy witness complex; image compression; data analysis; combinatorial structure
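
Illustrative note: the abstract says that core subsets of S are found using nearest neighbors but does not give the procedure. The sketch below assumes one common choice, which is to rank each point by the distance to its k-th nearest neighbor (a small distance means a dense region) and keep the densest fraction of points. The function names `kth_neighbor_distance` and `core_subset` and the parameters `k` and `fraction` are hypothetical and are not taken from the project or from JavaPlex. The brute-force search is only suitable for small samples.

```python
import math
import random


def kth_neighbor_distance(points, k):
    """Return, for each point, the Euclidean distance to its k-th nearest neighbor."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def core_subset(points, k, fraction):
    """Keep the given fraction of points with the smallest k-th-neighbor distance.

    Assumption: a small k-th-neighbor distance is used as a proxy for high density.
    The returned points keep their original relative order.
    """
    rho = kth_neighbor_distance(points, k)
    keep = max(1, int(len(points) * fraction))
    densest = sorted(range(len(points)), key=lambda i: rho[i])[:keep]
    return [points[i] for i in sorted(densest)]


if __name__ == "__main__":
    # Toy data (not project data): a noisy circle plus scattered outliers.
    rng = random.Random(0)
    circle = []
    for _ in range(200):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        radius = 1.0 + rng.gauss(0.0, 0.02)
        circle.append((radius * math.cos(angle), radius * math.sin(angle)))
    outliers = [(rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)) for _ in range(40)]
    sample = circle + outliers
    core = core_subset(sample, k=5, fraction=0.5)
    print(len(sample), len(core))
```

Varying `k` and `fraction` gives different core subsets of the same sample, which matches the abstract's mention of several core subsets. A small `k` reflects local density and a large `k` reflects density at a coarser scale. A complex such as the lazy witness complex would then be built on the selected points rather than on all of S.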