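Project title: Granular Computing Theory and Methods for Big Data
Project number: U1435212
Project type: Joint Fund Project
Year of approval: 2015
Discipline: Other
Project author: Liang Jiye (梁吉业)
Author affiliation: Shanxi University (山西大学)
Funding amount: 1.5 million CNY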
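Abstract (translated from Chinese): Traditional data analysis techniques cannot meet the computability, effectiveness, and timeliness requirements of big data, so exploring new granular computing theories and methods for big data analysis has both theoretical significance and practical value. Targeting three characteristics of big data (scale, multi-modality, and growth), this project conducts in-depth, systematic research on data granulation, multi-granularity pattern discovery and fusion, multi-granularity/cross-granularity reasoning, and application demonstration. Specifically, the project will: (1) explore the multi-view granulation mechanism of big data and give basic strategies and algorithms for data granulation; (2) study knowledge representation, multi-granularity modeling, and decision-making mechanisms for multi-modal information; (3) build a multi-granularity/cross-granularity reasoning mechanism using uncertain decision flow graphs and propose the corresponding reasoning algorithms; (4) using social media data and astronomical data as carriers, build a public safety monitoring and early-warning system that discovers, tracks, analyzes, and warns of public safety events, and build a solar activity forecasting model based on concept-ontology granulation and uncertain flow graphs. Two simulation systems, one for public-safety big data analysis and early warning and one for space weather forecasting, will be built for validation, providing basic theory and technical means for meeting military and civilian data analysis needs under the challenges of big data.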
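Keywords (translated from Chinese): Granular computing; Big data; Knowledge representation and reasoning; Data modeling; Data mining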
English abstract: Because classical data analysis techniques cannot satisfy the requirements of computability, validity, and efficiency in the context of big data, exploring new theories and methods of granular computing for big data has important theoretical significance and application value. Targeting three characteristics of big data, i.e., scale, multi-modality, and incremental growth, the project will carry out an in-depth and systematic study of data granulation, multi-granularity pattern discovery and integration, multi-granularity/cross-granularity reasoning, and application demonstration. Specifically, the main contents of the project are: (1) explore the multi-view granulation mechanism of big data, and give the basic strategies of data granulation and the corresponding algorithms; (2) study the knowledge representation, multi-granularity modeling, and decision-making mechanisms of multi-modal information; (3) build a multi-granularity/cross-granularity reasoning mechanism using uncertain decision flow graphs, and propose the corresponding inference algorithms; (4) based on astronomical and social media data, develop two typical application demonstrations: a solar activity forecast system and a public safety monitoring and warning system. These research results have important theoretical value for big data analysis and mining, and provide basic theory and applicable techniques for data analysis in the context of big data.
English keywords: Granular computing; Big data; Knowledge representation and reasoning; Data modeling; Data mining