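Project title: Association of repetitive gene structures and copy number variations with cancer, studied using next-generation sequencing
Project number: No.61501392
Project type: Young Scientists Fund project
Year of approval: 2016
Discipline: Radio electronics and telecommunications technology
Project author: Lian Shuaibin (连帅彬)
Author affiliation: Xinyang Normal University (信阳师范学院)
Funding: CNY 190,000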
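Abstract (translated from Chinese): Designing accurate algorithms for assembling repetitive gene sequences and for detecting copy number variations (CNVs) remains an open problem in bioinformatics, and research on how these variations relate to cancer is still at an early stage. The rapid development of next-generation sequencing makes it possible to detect repetitive gene sequences and CNVs across the whole genome and to analyze their association with cancer. This project therefore proposes a multidisciplinary approach to the integrated analysis of next-generation biomedical sequencing data. The main research topics are: 1) during genome assembly, apply methods such as signal filtering to remove coverage bias from the sequencing data, improving the confidence and accuracy of assembled repetitive gene sequences; 2) integrate several depth estimation methods (such as Bayesian estimation and maximum likelihood estimation) to improve breakpoint detection accuracy and copy number estimation precision for small CNVs (<500 bp); 3) combine gene expression data to build a network linking variant genes to expression levels, perform cluster analysis, introduce a hidden Markov model and compute its state transition probabilities, statistically analyze the association between variant genes and carcinogenesis, and validate the findings with biological experiments. The project lays groundwork for further research on questions that bear on human physiology and health, such as disease mechanisms and drug development.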
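Keywords (translated from Chinese): next-generation gene sequencing; genome assembly; copy number variation detection; repetitive gene sequence search; association study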
English abstract: Accurate algorithms for assembling repetitive gene sequences and for detecting copy number variations (CNVs) are urgently needed in bioinformatics, and research on their association with cancer is still at an early stage. Thanks to the rapid development of next-generation sequencing technologies, it is now possible to detect repetitive gene sequences and CNVs, and to analyze their association with cancer, at whole-genome scale. This project will therefore combine multidisciplinary approaches to integrate and analyze biomedical sequencing data. The main research contents are: 1) methods such as signal filtering will be used during genome assembly to remove coverage bias from the sequencing data and to improve the confidence and accuracy of assembled repetitive gene sequences; 2) several depth estimation methods (such as Bayesian estimation and maximum likelihood estimation) will be combined to improve breakpoint detection accuracy and copy number estimation accuracy for small CNVs (<500 bp); 3) gene expression data and the corresponding variant genes will be integrated to build network structures, followed by cluster analysis and the computation of state transition probabilities with a Hidden Markov Model (HMM); the association between cancer and variant genes will then be tested statistically, and the results will be verified by biological experiments. Overall, the results of this project will lay a foundation for further research, such as studies of pathogenic mechanisms and drug development, that benefits human physiology and health.
English keywords: Next Generation Sequencing Technologies (NGS); Genome Assembly; Copy Number Variation Detection; Repetitive Genome Finder; Association Study
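
Illustrative note: item 2) of the abstract names maximum likelihood estimation as one of the depth estimation methods for copy number. The sketch below shows the general idea only and is not the project's algorithm. It assumes that read counts in fixed-size windows follow a Poisson distribution whose mean is proportional to copy number. The function names, the `depth_per_copy` parameter, the `max_copies` limit, and the 1% floor for copy number 0 are all hypothetical choices made for this example.

```python
import math


def poisson_log_likelihood(count, mean):
    """Log of the Poisson probability of observing `count` given `mean`."""
    return count * math.log(mean) - mean - math.lgamma(count + 1)


def ml_copy_number(window_counts, depth_per_copy, max_copies=8):
    """Return the integer copy number that maximises the Poisson likelihood.

    window_counts  -- read counts observed in windows covering one region
    depth_per_copy -- expected read count per window for a single copy
                      (assumed known, e.g. from a normalised control)
    max_copies     -- largest copy number considered (hypothetical cap)
    """
    best_cn, best_ll = None, -math.inf
    for cn in range(max_copies + 1):
        # Floor the mean so copy number 0 still tolerates a few stray reads
        # (the 1% floor is an assumption of this sketch).
        mean = max(cn * depth_per_copy, 0.01 * depth_per_copy)
        ll = sum(poisson_log_likelihood(c, mean) for c in window_counts)
        if ll > best_ll:
            best_cn, best_ll = cn, ll
    return best_cn


if __name__ == "__main__":
    # Windows averaging about 30 reads at 15 reads per copy suggest 2 copies.
    print(ml_copy_number([29, 31, 30, 28, 32], depth_per_copy=15))
```

A Bayesian variant, the other method the abstract mentions, would add a log-prior over copy numbers to each log-likelihood before taking the maximum. Real sequencing depth is usually overdispersed relative to a Poisson model, so this sketch leaves out the bias correction described in item 1).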