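Project title: Constraint-Based Robust Fuzzy Hyper-Prototype Clustering Methods and Their Application to Biomedical Data Analysis
Project number: No.61303182
Project type: Young Scientists Fund project
Approval year: 2014
Discipline: Automation technology, computer technology
Principal investigator: Liu Jin (刘晋)
Affiliation: China University of Mining and Technology (中国矿业大学)
Funding: 230,000 CNY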
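Abstract (translated from Chinese): Fuzzy c-means (FCM) clustering has been applied successfully in areas such as gene data mining and magnetic resonance image processing. However, FCM still has inherent shortcomings: for example, it is sensitive to noise and is limited to data sets with spherical or hyper-spherical structure. A number of improved FCM algorithms have been proposed, such as constraint-based robust FCM algorithms and the hyperplane-based fuzzy clustering algorithm proposed in the applicant's earlier work. This project intends to combine various constraint terms with hyperplane-based data analysis to develop a family of new robust fuzzy hyper-prototype clustering algorithms, so that data analysis is robust to noise while also accommodating non-spherical data sets. The research comprises: ① formulating the objective functions of the new algorithms by combining the constraint terms of existing robust FCM algorithms with fuzzy hyper-prototype clustering; ② writing down and solving the Lagrangian corresponding to each new objective function and its constraints; ③ implementing the algorithms from the resulting solutions, validating them on synthetic and real-world data sets, and comparing them with earlier algorithms; ④ applying the new algorithms to microarray gene expression data analysis and MR brain image analysis. The project is expected to provide powerful and reliable computational tools for automated data analysis in various fields.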
Keywords (translated from Chinese): FCM; biomedical data analysis; fuzzy clustering
English abstract: Although fuzzy c-means (FCM) clustering has been successfully applied in areas such as microarray gene expression data analysis and magnetic resonance imaging (MRI) analysis, FCM still has some inherent disadvantages, e.g. the algorithm is sensitive to noise and only fits data sets with spherical or hyper-spherical structures. Many modified algorithms have been put forward, e.g. robust FCM clustering with constraints and the novel fuzzy clustering combined with hyperplanes recently proposed by the applicant. In this project, information from constraint terms is to be combined with hyperplane-based data analysis, and a group of novel robust fuzzy clustering algorithms that can handle noisy data sets with non-spherical structure will be studied. The aims of the project can be summarized as follows: 1. the objective functions of the algorithms that combine constraint terms with fuzzy hyper-prototype clustering will be studied; 2. the Lagrangians of the objective functions under constraints will be formulated; 3. the novel algorithms will be derived and validated on both synthetic and real-world data sets, and their performance will be compared with that of existing methods; 4. the algorithms will then be applied to solve real-world problems such as microarray gene expression da
English keywords: FCM; bio-medical data analysis; fuzzy clustering
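
Illustrative note: the abstract builds on standard fuzzy c-means, whose alternating update rules follow from minimizing the objective J = Σ_k Σ_i u_ik^m ||x_k − v_i||² with a Lagrange multiplier enforcing Σ_i u_ik = 1 for every point. The sketch below implements only that baseline FCM, not the project's proposed hyper-prototype or constraint-term variants, which this record does not specify. The function name `fcm` and its parameters (`m`, `iters`, `tol`, `seed`) are assumptions chosen for illustration.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Baseline fuzzy c-means (hypothetical helper, not the project's method).

    points: list of equal-length numeric tuples; c: number of clusters;
    m > 1: fuzzifier. Returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: v_i = sum_k u_ik^m x_k / sum_k u_ik^m.
        centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[k] * points[k][d] for k in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update: u_ik is proportional to (d_ik^2)^(-1/(m-1)).
        shift = 0.0
        for k in range(n):
            d2 = [
                max(sum((points[k][d] - centers[i][d]) ** 2
                        for d in range(dim)), 1e-12)  # guard against zero distance
                for i in range(c)
            ]
            inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
            inv_sum = sum(inv)
            new_row = [v / inv_sum for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[k])))
            u[k] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u
```

Because each prototype here is a single point and the distance is Euclidean, this baseline favours spherical clusters and lets every outlier pull on the centers, which are the two limitations the abstract sets out to address.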