Statistical Methods for Linkage Disequilibrium Gene Mapping of Human Complex Diseases

Project title: Statistical Methods for Linkage Disequilibrium Gene Mapping of Human Complex Diseases

Project number: No.11301206

Project type: Young Scientists Fund project

Year of approval: 2014

Discipline: Mathematical and physical sciences and chemistry

Project author: Li Yumei (李玉梅)

Author affiliation: Huaihua University (怀化学院)

Funding amount: 230,000 CNY

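Abstract (translated from Chinese): With the continuing development of genotyping technology, linkage disequilibrium (LD) mapping, or association studies, using haplotype or genotype data from multiple genetic markers has become one of the main approaches to identifying genes for complex human diseases. However, current statistical methods for identifying such genes have several shortcomings. First, haplotype-based methods require known haplotype frequencies, which must be estimated when only genotype data are available. Second, most existing methods based on LD measures use population samples and require random mating in the population, that is, Hardy-Weinberg equilibrium, so they may be affected by population admixture or stratification. Third, statistical analyses usually assume implicitly that the genetic data are error-free, yet even small errors in the genetic data, namely genotyping errors, can bias the results. To address the first shortcoming, this study will construct statistics based on genotype data; to address the second, statistics based on a composite LD measure and statistics that are unaffected by population admixture; and to address the third, statistics that are unaffected by genotyping errors. Our work considers several practical problems that arise in genetic analysis, and its results will help identify genes for complex human diseases more effectively and more accurately.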

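Keywords (translated from Chinese): sequencing technology; association analysis; statistical methods; common variants; rare variants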

English abstract: With the development of genotyping technology, linkage disequilibrium (LD) mapping, or association analysis, using haplotype or genotype data from multiple markers has become one of the major statistical approaches for identifying human disease genes. However, current statistical methods face several challenges. First, when haplotypes are not directly observed and only genotype data at multiple markers are collected, haplotype phases and frequencies must be estimated. Second, most population-based LD methods are developed under the assumption of Hardy-Weinberg equilibrium (HWE) and may therefore be affected by population admixture. Third, these analyses almost always assume that the genetic data are free of errors; however, even a very small genotyping error rate can bias the results. The purpose of our research is therefore threefold: to develop genotype-based statistics, to develop statistics that are based on composite LD and are robust to population admixture, and to develop statistics that are not affected by genotyping errors. Because it addresses several practical problems in genetic analysis, our research will help identify human disease genes more effectively and accurately.

English keywords: sequencing technology; association analysis; statistical method; common variant; rare variant
