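Project title: Research on cluster-center plane and subspace clustering methods with between-cluster separation
Project number: No.11501310
Project type: Young Scientists Fund project
Year of approval: 2016
Discipline: Mathematical and physical sciences and chemistry
Project author: Wang Zhen (王震)
Author affiliation: Inner Mongolia University
Funding: 180,000 yuan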
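Chinese abstract (translated): The k-means method is the representative of clustering methods based on cluster center points, and its cluster centers play a crucial role in the clustering process. Research on k-means continues to broaden and deepen. This project studies center-based clustering methods, extends the concept of the cluster center from a point to a plane and to a linear subspace, and studies clustering methods based on cluster-center planes and cluster-center subspaces. Specifically, we first draw on the theory and models of supervised nonparallel hyperplane support vector machines, introduce the concept of between-cluster separation into clustering, construct cluster-center plane and subspace clustering models with strong or weak separation, and propose feasible solution algorithms using non-convex optimization theory. Second, we introduce the concept of localization to further diversify the shape of the cluster center, and study clustering methods based on localized planar patches and bounded subspaces. On this basis, kernel methods extend the cluster center from the linear to the nonlinear case. Third, by studying and proposing unsupervised evaluation criteria under different cluster-center concepts, we study a framework for clustering with mixed cluster centers and propose clustering methods that combine two or more cluster-center shapes. Finally, the clustering methods based on the various cluster centers and their mixtures are applied to practical problems such as image segmentation and graphic component matching.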
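Chinese keywords (translated): Data mining; Clustering methods; Partitional clustering; Constrained nonlinear programming; Non-convex programming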
English abstract: k-means is the representative point-based clustering method, and its cluster centers play a crucial role in the clustering process. Research on k-means continues to broaden and deepen. This project studies center-based clustering and extends the cluster center from a point to a plane and to a flat (linear subspace), in order to study plane-based and flat-based clustering methods. Specifically, drawing on the theory of nonparallel hyperplane support vector machines, we first construct plane-based and flat-based clustering models with strong or weak between-cluster separation, and use non-convex optimization theory to obtain feasible algorithms for these models. Second, we extend the shape of the cluster center through the concept of localization and study clustering methods based on localized planar patches and bounded flats. On this basis, the kernel trick extends these cluster centers to the nonlinear case, giving surface-based and manifold-based clustering. Third, by studying unsupervised criteria for cluster centers of different shapes, we develop a framework for clustering with mixed cluster centers and propose clustering methods that use two or more types of cluster center. Finally, these clustering methods are applied to real-world problems such as image segmentation and graphic component matching.
English keywords: Data mining; Clustering; Partition clustering; Constrained nonlinear programming; Non-convex programming