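Project title: Theory and Methods of Big Data Feature Fusion Based on Granular Computing
Project number: No.61502104
Project type: Young Scientists Fund project
Year of approval: 2016
Discipline: Other
Project author: Wang Shiping (王石平)
Author affiliation: Fuzhou University (福州大学)
Funding: 200,000 yuan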
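Chinese abstract (translated): Big data is, in essence, the processing of large volumes of semi-structured and unstructured high-dimensional, multi-modal features. Big data refers to massive, fast-growing, and diverse information assets that require new processing paradigms to deliver stronger decision-making power, insight discovery, and process optimization. Granular computing theory is an important branch of data mining; it provides a series of information granulation schemes and solution methods for unstructured and semi-structured data, and research on it has made considerable progress in recent years. This project will systematically study granular-computing-based feature fusion for big data at four levels: data model, computing model, problem, and algorithm. The main research topics are: 1) granular-computing-based multi-channel image feature selection; 2) multi-modal feature fusion models based on structured learning; 3) multi-modal feature fusion models based on multiple kernel learning; 4) batch matrix iterative algorithms for the corresponding models. By exploring and innovating on these topics, the project aims to build a four-level theoretical framework, pose and solve the key problems within it, develop high-performance algorithms for specific problems, and provide efficient, low-cost, low-risk data mining solutions for practical applications.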
Chinese keywords (translated): granular computing; machine learning; data mining; big data; feature selection
English abstract: Big data is, in essence, a technology for processing the high-dimensional, multi-modal features found in semi-structured and unstructured data. It comprises a set of techniques and technologies that require new forms of integration to uncover hidden value in large datasets that are diverse, complex, and massive in scale. Granular computing is a branch of data mining that provides a series of methods for information granulation and information transformation on semi-structured and unstructured data, and it has attracted considerable attention in recent years. In this project, we plan to study the feature fusion problems of big data systematically, based on granular computing, at four levels: data model, computing model, problem, and algorithm. The main research topics are: 1) granular-computing-based multi-channel feature extraction from image data; 2) multi-modal feature fusion based on structural learning; 3) multi-modal feature fusion based on multi-kernel learning; 4) matrix iterative algorithms for the above problems. Through exploration and innovation on these topics, we plan to construct a theoretical system spanning the four levels, define and solve the core problems at each level, develop efficient problem-specific algorithms, and provide high-efficiency, low-cost, low-risk data mining methods.
English keywords: granular computing; machine learning; data mining; big data; feature fusion