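Project title: Clustering-oriented dictionary learning and sparse-representation-based clustering of high-dimensional data
Project number: No. 71271027
Project type: General Program
Approval year: 2013
Discipline: Management Science
Project author: Wu Sen (武森)
Affiliation: University of Science and Technology Beijing (北京科技大学)
Funding: 540,000 CNY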
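Abstract (translated from Chinese): Cluster analysis of high-dimensional data is one of the current research hotspots and difficulties in data mining, and is widely applied in Internet knowledge discovery and management decision support. In recent years, the successful application of sparse representation theory in machine learning and pattern recognition has offered a new line of approach for high-dimensional data clustering. Aiming to improve both sparse representation theory and high-dimensional clustering methods, this project studies clustering-oriented dictionary learning and then applies sparse representation theory to problems in high-dimensional data clustering. The main research contents are: (1) Explore sparse representation theory and the mechanism behind its successful application to classification tasks, build sparse representation models for different types of high-dimensional data, and study sparse representation dictionary learning methods oriented toward the high-dimensional data clustering task. (2) Based on the solutions of sparse representation, study methods for handling missing data and for measuring similarity in different types of high-dimensional data, providing a research basis for sparse-representation-based high-dimensional clustering algorithms. (3) From the perspectives of clustering validity evaluation and clustering algorithms for high-dimensional data, study a sparse-representation-based framework for high-dimensional data clustering, in order to improve the results of high-dimensional data clustering.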
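Keywords (translated from Chinese): High-dimensional data; cluster analysis; sparse representation; dictionary learning; missing data imputation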
English abstract: Cluster analysis of high-dimensional data is one of the most active and difficult topics in data mining, with a wide range of applications in Internet knowledge discovery and management decision support. Sparse representation theory, which has been applied successfully in machine learning and pattern recognition in recent years, offers a fresh approach to high-dimensional data clustering. Aiming to improve both sparse representation theory and high-dimensional data clustering methods, this proposal studies clustering-oriented dictionary learning and applies sparse representation theory to related problems in high-dimensional data clustering. The main contents of the research are as follows: (1) Explore sparse representation theory and the mechanism behind its successful application to classification tasks; establish sparse representation models for different types of high-dimensional data; and study sparse dictionary learning algorithms oriented toward the high-dimensional data clustering task. (2) Study missing value imputation and similarity measurement for different types of high-dimensional data based on the solutions of sparse representation, to provide a basis for the high-dimensional data clustering algorithm via sparse representati
English keywords: High-dimensional data; Cluster analysis; Sparse representation; Dictionary learning; Missing value imputation