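Project title: Regulatory Motif Prediction Based on ChIP-seq Data and Phylogenetic Information
Project number: No.61303084
Project type: Young Scientists Fund project
Year of approval: 2014
Discipline: Automation technology; computer technology
Project author: Liu Bingqiang (刘丙强)
Author affiliation: Shandong University
Funding: 230,000 CNY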
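Chinese abstract (translated): Transcription is the first step of gene expression, and transcriptional regulation is usually achieved by transcription factors binding to regulatory motifs upstream of genes. Regulatory motif prediction has therefore long been an important topic in bioinformatics. In recent years, rapidly growing genomic data have offered new opportunities for studying regulatory motifs, and ChIP-seq, which is based on next-generation sequencing, has brought regulatory motif research to the whole-genome level. This project proposes to predict regulatory motifs by combining ChIP-seq data with phylogenetic footprinting. Building on the complementarity of the two data types, we start from the problems that arise in processing each of them and focus on data fusion in order to improve motif prediction performance. The main work comprises: (1) converting the regulatory motif information contained in phylogenetic footprinting and in ChIP-seq into conservation curves and coverage curves, respectively, and fusing the two effectively; (2) using phylogenetic trees, operons, sequence alignments and related information to address problems in phylogenetic footprinting; (3) designing ChIP-seq data-processing methods that reduce the influence of noise and experimental bias; (4) predicting and refining motifs on the basis of the fused data. The ultimate aim is to make the heterogeneous data sources complement each other, improve prediction accuracy, and provide software and database services.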
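Chinese keywords (translated): regulatory motif; phylogenetic footprinting; algorithm; regulon; chromatin immunoprecipitation sequencing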
English abstract: Transcription is the most essential step of gene expression, and transcriptional regulation is accomplished through transcription factors binding to cis-regulatory motifs located upstream of the corresponding genes. The prediction of cis-regulatory motifs is therefore one of the most important computational problems in bioinformatics. Recently, the rapid growth of genome sequencing data has provided new opportunities in this field, and high-throughput chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq) has brought the problem to the genome scale. This project intends to predict cis-regulatory motifs using ChIP-seq data together with phylogenetic footprinting information. To improve on the performance of traditional motif prediction algorithms, we integrate the two kinds of information, exploiting their complementarity. The key contributions of this project are to (i) convert the cis-regulatory motif signals embedded in phylogenetic footprinting and ChIP-seq into a conservation curve and a coverage curve, respectively; (ii) enrich the phylogenetic footprinting pipeline using the phylogenetic tree, operon structure and multiple sequence alignment; (iii) design a ChIP-seq data-processing framework to reduce the influence of backgrou
English keywords: Regulatory Motif; Phylogenetic Footprinting; Algorithm; Regulon; ChIP-seq