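Project title: Research on Operational-Pattern-Oriented Reduction Algorithms in the Context of Big Data
Project number: No. 61502538
Project type: Young Scientists Fund project
Approval year: 2016
Discipline: Other
Principal investigator: Yin Linzi (尹林子)
Institution: Central South University (中南大学)
Funding: 200,000 yuan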
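Chinese abstract: With the rise of big data computing, research on big data reduction algorithms is important to both the development and the application of reduction theory. However, because of data skew and randomness, the performance of traditional reduction algorithms falls short of the demands of complex applications. Based on MapReduce and on operational patterns as a representative application, this project studies efficient big data reduction algorithms. It includes: (1) A fast reduction algorithm. To address the effect of data skew on reduction efficiency, the project explores a reduction algorithm based on MapReduce sorting techniques and compares it with three traditional reduction algorithms to demonstrate the new algorithm's computational advantage under data skew. (2) A best reduction algorithm. To address the mismatch between reduction results and application needs caused by randomness, the project proposes the concept of a best reduct, defines a method for describing the background knowledge of complex industrial processes in combination with the optimization objective of operational patterns, and designs a best reduction algorithm that does not depend on attribute significance. (3) A customized reduction method that is based on big data and oriented to operational patterns. This research is important for integrating big data computing with data reduction, and it also provides theoretical support and practical cases for applying reduction algorithms to optimal control methods for complex industrial processes, especially operational patterns.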
Chinese keywords: attribute reduction; rough sets for big data; operational pattern
English abstract: With the development of big data, research on big-data reduction algorithms is highly relevant to rough set theory and its applications. However, existing algorithms are not effective enough in complex applications because of data skew and randomness. This project will study efficient algorithms based on MapReduce and operational patterns. It includes: (1) A fast reduction algorithm. To reduce the influence of data skew, a reduction algorithm based on the sorting mechanism of MapReduce will be investigated, and its running-time advantage will be demonstrated by comparison with three traditional algorithms. (2) A best reduction algorithm. A novel notion of best reduct will be introduced to resolve the mismatch between application requirements and reducts that randomness causes. By incorporating the optimization objective of operational patterns, a description of the background knowledge of complex industrial processes will be studied, and a best reduction algorithm that does not rely on the traditional attribute-significance strategy will be proposed. (3) A customized, big-data-based reduction method for operational patterns. This research is important to the theoretical development of big-data-based reduction. It is also significant for improving the efficiency of optimal control methods for complex industrial processes, such as operational patterns.
English keywords: attribute reduction; rough set based on big data; operational-pattern