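Project title: Research on Multiple Testing Problems in Next-Generation Sequencing Data
Project number: No.11301554
Project type: Young Scientists Fund project
Year of approval: 2014
Discipline: Mathematical and physical sciences and chemistry
Project author: You Na (尤娜)
Affiliation: Sun Yat-sen University (中山大学)
Funding: 220,000 CNY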
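Abstract (translated from Chinese): Gene mutation analysis is an important route to uncovering the relationship between complex diseases and genetic variation. The advent of next-generation sequencing has made genome-wide mutation scanning technically feasible, but while the technology raises speed and lowers cost, it also creates the problem of analyzing massive data sets. Accurately and efficiently interpreting the biological information carried by these data is a major challenge of the post-genomic era. At present, mutation analysis of next-generation sequencing data mostly relies on Bayesian models, whose results usually contain too many false positives. For microarray data, scanning for mutated probes can be framed as a multiple hypothesis testing problem, and progress in microarray data analysis has spurred the rapid development of multiple hypothesis testing methods, but these algorithms cannot be applied directly to next-generation sequencing data. This project aims to develop multiple-testing FWER/FDR control methods tailored to next-generation sequencing data, and to build mutation analysis algorithms and software for such data that control false positives at the genome level while maintaining detection efficiency, thereby improving the accuracy of mutation analysis. In addition, the project will develop parallel computing algorithms to accelerate the computer programs and increase the practical value of the basic research.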
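Keywords (translated from Chinese): next-generation sequencing; mutation site analysis; multiple hypothesis testing; FDR control; R package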
English abstract: Gene mutation analysis plays an important role in genomics research by revealing the relationship between complex diseases and gene structural variations. The development of next generation sequencing technology makes genome-wide mutation screening possible; it not only greatly improves sequencing speed but also decreases sequencing cost. On the other hand, this platform produces ultra-high-dimensional data. How to accurately and efficiently infer the biological meaning of these high-dimensional data is a main problem we face in this post-genomic era. Currently, most mutation analysis methods for next generation sequencing data are based on Bayesian models, whose results usually include many false positives. In microarray data analysis, single feature polymorphism detection can be carried out under the multiple testing framework, and many FDR control methods have been proposed for this purpose. However, these methods cannot be applied directly to next generation sequencing analysis. In this study, we will develop FWER/FDR control methods for the multiple testing procedure and propose a mutation analysis tool for next generation sequencing data, to control the number of false positives at the genome-wide level and increase the accuracy of mutation analy
English keywords: next generation sequencing; SNP detection; multiple testing; FDR control; R package