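Project title: Research on Key Technologies of Swarm Collaborative Intelligent Clustering in Big Data Environments
Project number: No.61472049
Project type: General Program (面上项目)
Year of initiation/approval: 2015
Project discipline: Other
Project author: Han Xuming (韩旭明)
Author affiliation: Changchun University of Technology (长春工业大学)
Project funding: 800,000 CNY (80万元)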
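Abstract (translated from Chinese): Taking into account the analysis and processing demands of massive data in big data environments, this project conducts in-depth research from both the theoretical and the experimental-verification sides. It focuses on the key factors that affect clustering performance: cluster centers, distance measurement methods, similarity operators, and clustering time complexity, and it draws on new swarm intelligence algorithms that have emerged in recent years to innovate on and improve them theoretically. The main content includes: (1) research on random selection from ultra-large-scale data and on the clustering stability of sampled data; (2) research on multi-swarm collaborative intelligent cluster centers; (3) research on similarity measurement methods; (4) research on multi-swarm collaborative intelligent evolution strategies. Through theoretical improvement and innovation, the project aims to deliver effective solutions to the key technologies for constructing efficient clustering algorithms; (5) on this basis, a staged swarm collaborative intelligent clustering algorithm is proposed. Fast search by the swarm collaborative intelligence algorithm determines and initializes the cluster centers, and the multi-swarm collaborative intelligent evolution strategy achieves efficient distributed clustering of the data within each cluster; (6) finally, a systematic multi-swarm collaborative intelligent clustering model for big data environments is formed. The work enriches and develops data mining theory and algorithms for massive data, and it is significant for research on intelligence theory and its application to clustering in data mining.
Keywords (translated from Chinese): big data; swarm intelligence; co-evolution; clustering; optimization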
English abstract: With comprehensive consideration of the analysis and processing demands of massive data in big data environments, thorough theoretical and experimental research was carried out in this study in search of new clustering methods. The research focused on the key factors influencing clustering performance, such as cluster centers, distance measurement methods, the number of clusters, similarity operators, and time complexity. Theoretical innovation and optimization were performed using the new swarm intelligence algorithms that have emerged in recent years. The main content includes: (1) the study of random selection from super-scale data and the clustering stability of sampled data; (2) the study of cluster centers in collaborative multi-swarm intelligent clustering; (3) the study of similarity measurement methods; (4) the study of collaborative multi-swarm intelligent evolution strategies. Through theoretical improvement and innovation, we implemented effective solutions to the key technologies for building efficient clustering algorithms. (5) On this basis, we also propose a staged collaborative multi-swarm intelligent clustering algorithm. Through fast search with the collaborative swarm intelligence algorithm, we are able to determine and initialize the cluster centers; and through the collaborative multi-swarm intelligent evolution strategy, we are able to realize efficient distributed clustering of the data within each cluster. (6) Eventually, a systematic collaborative multi-swarm intelligent clustering model for big data environments was established. This study can enrich and expand the theories and algorithms of data mining for massive data, and it is significant for theoretical research on intelligence and its application to clustering in the data mining field.
English keywords: Big data; swarm intelligence; cooperative evolution; clustering; optimization