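Project title: Research on Gene Set Enrichment Meta Analysis Methods Based on Information Ensembles
Project number: No.61202273
Project type: Young Scientists Fund
Year of approval: 2013
Discipline: Computer Science
Project author: Zhang Shaohong (张少宏)
Affiliation: Guangzhou University (广州大学)
Funding: CNY 240,000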
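Chinese abstract: Gene set enrichment analysis is a widely accepted standard bioinformatics method for studying gene regulatory mechanisms in biological processes, and Meta analysis, which combines data from different independent experiments, is often used to improve the reliability of the results. In current research on stem cell-derived cardiomyocytes, cancer stem cells, and related topics, however, the independent data sets have limited samples, few genes in common, diverse platforms, and differing degrees of importance, so traditional methods are hard to apply effectively. To address these problems, this project proposes gene set enrichment Meta analysis methods based on information ensembles. The main innovations are: (1) a feature mapping mechanism that gives gene set enrichment Meta analysis a unified basis for jointly processing data from the independent studies, and that extends the scope from the two-class problems handled by traditional methods to multi-class and unsupervised problems; (2) a sample similarity measure based on heterogeneous cluster ensembles that can effectively distinguish true sample similarity from spurious similarity arising by chance, providing a computational basis for sample selection. The proposed research provides important computational methods and practical information technology for a deeper understanding of the relevant gene regulatory mechanisms.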
Chinese keywords: gene set enrichment analysis; data Meta analysis; information ensembles; stem cells;
English abstract: Gene set enrichment analysis is one of the standard bioinformatics approaches for characterizing gene regulatory mechanisms from gene expression data. Moreover, integrating independent studies on the same topic, referred to as Meta analysis, is well recognized as a way to improve reliability. However, for gene expression data from cancer stem cells and stem cell-derived cardiomyocytes, it is impractical to combine data from different studies for effective gene set enrichment analysis, owing to the limited number of data samples, the scarcity of genes common to the studies, the diversity of platforms used, and the differing importance of the data sets. In view of these problems, in this project we propose gene set enrichment Meta analysis methods based on information ensembles, with the following novelties: (1) Feature mapping provides an effective solution for gene set enrichment Meta analysis on different data and enables the proposed methods to handle multi-class and unsupervised problems, whereas traditional methods can handle only two-class problems; (2) Data sample similarity measures based on heterogeneous cluster ensembles discriminate real similarity from false-positive similarity arising by chance. Our study will provide both improved i
English keywords: gene set enrichment analysis; data Meta analysis; information ensembles; stem cells;