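Project title: Finding Sequence Modules with Complex Structures in Metagenomes and Analyzing Their Functions
Project number: No.61472246
Project type: General Program
Year of approval: 2015
Discipline: Computer Science
Project author: 韦朝春
Affiliation: Shanghai Jiao Tong University
Funding: 800,000 CNY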
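Abstract (translated from the Chinese): A series of sequence modules with complex structures discovered in recent years, such as CRISPR and DGR, play important roles in microbial immunity and evolution. Metagenomics analyzes the natural composition and function of the community in a specific environment by direct sequencing, and is one of the best approaches for studying the activity and potential functions of these sequence modules. High-throughput sequencing data are now available for many thousands of metagenomes, and accurately finding and analyzing these modules in such massive datasets requires a high-speed analysis method and system. For metagenomic sequence data, this project proposes to introduce statistical-model-based prediction models for sequence modules with complex structures and to combine them with existing analysis methods and systems, such as metagenome sequence assembly, in order to develop an automated system for finding and analyzing these modules. The system's application will be demonstrated by finding and analyzing sequence modules such as CRISPR and DGR in different metagenomes, and the predictions in gut microbial metagenomes will be validated with new-generation sequencing technologies that produce longer reads. The project will provide important tools and resources for the further development and application of metagenomics, and has considerable theoretical significance and practical value.
Keywords (translated from the Chinese): metagenomics;high-throughput sequencing;sequence module;hidden Markov model;DGR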
English abstract: Sequence modules such as CRISPR and DGR have complex sequence structures and play important roles in the immunity and evolution of microbes. Metagenomics studies the composition and function of a microbial community by directly sequencing all genetic material from an environment. It is one of the best methods for studying the activity and potential functions of sequence modules with complex structures. Thousands of metagenomes have now been sequenced, and a fast analysis tool is in great demand to find and analyse these sequence modules in metagenomes. This project plans to create a system that finds and analyses sequence modules with complex structures in metagenomes. To build it, a statistical-model-based sequence module finder will be integrated with metagenome sequence analysis methods such as metagenome assembly. CRISPR and DGR finding in metagenomes will be shown as example applications of the system, and new sequencing platforms with long read lengths will be used to verify the prediction accuracy in gut metagenomes. This project will provide important tools and resources for a new generation of metagenomics.
English keywords: metagenomics;high throughput sequencing;sequence module;hidden Markov model;DGR