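Project title: Research on Multi-Granularity Knowledge Discovery Models and Methods for Big Data (大数据中的多粒度知识发现模型与方法研究)
Project number: No.61309014
Project type: Young Scientists Fund project
Year of approval: 2014
Discipline: Automation technology, computer technology
Principal investigator: Hu Feng (胡峰)
Affiliation: Chongqing University of Posts and Telecommunications (重庆邮电大学)
Funding: 230,000 CNY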
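Abstract (translated from Chinese): Data-driven scientific research has become a new wave in the development of science and technology worldwide, and intelligent analysis and knowledge discovery for big data are increasingly a key scientific challenge, yet traditional data mining methods cannot process big data effectively. Two theoretical and technical problems urgently need solving: there is no formal descriptive model for dynamic, high-dimensional, complex big data, and existing methods struggle to handle data at the TB scale and above. This project therefore takes effective knowledge discovery in big data as its research goal and studies the following key problems: building a multi-granularity knowledge representation model for complex big data, based on the mechanism by which humans express and process knowledge across multiple levels of granularity; proposing dimensionality reduction and sampling methods that simplify big data at the data level, so that it can be processed efficiently on the simplified data model; and proposing task decomposition and solution methods for knowledge discovery in complex big data, so as to achieve progressive knowledge discovery. This research will contribute to the intelligent analysis of big data and to knowledge discovery from it, help improve the processing capability of granular computing and rough sets in big data environments, and advance research and applications in big data, granular computing, and related fields.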
Keywords (translated from Chinese): granular computing; rough set; big data; knowledge discovery;
Abstract (English): Data-driven research is attracting increasing attention in the development of science and technology, and intelligent analysis and knowledge discovery for big data have become key scientific problems. Two observations motivate this project: (1) there is no formal descriptive model for dynamic, high-dimensional, complex big data, and (2) most popular methods are not suited to processing big data at the TB scale and above. To address these problems, the project proposes knowledge discovery methods for big data. Its contents include: (1) designing and constructing a multi-granularity knowledge description model for big data, which can describe the process by which one recognizes and discovers knowledge from original data; (2) finding data-driven methods for high-dimensional feature selection and sampling, through which the size of big data can be reduced; (3) designing and constructing methods for decomposing and solving complex tasks, which can be used for knowledge discovery and incremental learning on big data. This research project may help advance the intelligent analysis of big data and knowledge discovery from it. Furthermore, it may also improve the capability of rough set theory and granular computing in big data environments, and promote the research and application of big data, granular computing, and related fields.
Keywords (English): granular computing; rough set; big data; knowledge discovery;