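Project title: Research on Key Problems of Granular Computing Models for Outlier Knowledge Discovery in High-Dimensional Hybrid Data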
Project number: No.11471001
Project type: General Program
Year of approval: 2015
Project discipline: Mathematical and Physical Sciences and Chemistry
Project author: Deng Tingquan (邓廷权)
Author affiliation: Harbin Engineering University (哈尔滨工程大学)
Project funding: 750,000 yuan
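Chinese abstract: Outlier knowledge discovery is a special and significant research topic in data mining and knowledge discovery. It is no longer limited to the early goal of eliminating noisy data; rather, it increasingly aims to discover and reveal rare patterns hidden in a data set that are meaningful yet differ markedly from most of the data. Whether data are anomalous is determined not by some of their attributes but by all of their attributes acting jointly. Combining granular computing theory with nonlinear feature extraction techniques, this project studies key problems in characterizing and discovering local outlier knowledge in high-dimensional hybrid data from the perspectives of multiple granularities and multiple feature subspaces. It explores solutions to a series of key problems, including granulation-based clustering methods for data, effective feature selection mechanisms, ensemble learning models over multiple feature subspaces, and criteria for judging outlier knowledge. The main research topics are: (1) granulation methods based on information exchange between attributes and objects in high-dimensional hybrid data; (2) measures of attribute significance and methods for minimal knowledge reduction; (3) nonlinear dimensionality reduction strategies for high-dimensional hybrid data and the characterization of outlier knowledge; and (4) the construction of outlier subspaces and integrated group decision-making methods over multiple feature subspaces.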
Chinese keywords: uncertainty theory;granular computing;rough sets;data mining;feature extraction
English abstract: Outlier knowledge discovery is a specific but significant research topic in data mining and knowledge discovery. It is not confined to finding and reducing noisy data, but is increasingly applied to discovering and revealing hidden, rare but significant patterns that are distinct from the majority of the data. The abnormality of data is not determined by a subset of their attributes, but by all of their attributes acting together. This project is dedicated to researching problems of local outlier knowledge characterization and discovery in high-dimensional hybrid data by incorporating the theory of granular computing and the techniques of nonlinear feature extraction from the viewpoints of multi-granulation and multi-feature subspaces. A series of crucial problems are probed and solved, including methods of granulation clustering, principles of efficient feature selection, and criteria for outliers based on integrated group decision-making methods over multi-feature subspaces. The main focuses of this project include (1) granulation of high-dimensional hybrid data based on information communication between attributes and objects; (2) measures of attribute significance and methods of minimal reduction of knowledge; (3) strategies of nonlinear dimensionality reduction and characterization of outlier knowledge; and (4) establishment of outlier subspaces and integrated group decision-making methods over multi-feature subspaces.
English keywords: uncertainty theory;granular computing;rough sets;data mining;feature extraction