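Project title: Research on Protein Subcellular Localization Prediction Algorithms Based on Multi-Source Information Fusion
Project number: No.61272312
Project type: General Program
Year of initiation/approval: 2013
Discipline: Automation technology; computer technology
Project author: Yao Yuhua (姚玉华)
Affiliation: Zhejiang Sci-Tech University (浙江理工大学)
Funding amount: 800,000 yuan (CNY)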
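Abstract (translated from Chinese): Predicting protein subcellular localization is important for studying protein function, interactions, and regulatory mechanisms. This project addresses the extraction, selection, and fusion of information in localization prediction, focusing on how to fully extract protein sequence and structure information, select the core information, and find effective modeling methods for the prediction strategy. The main work includes: combining multivariate statistics, the Lempel-Ziv algorithm, and computational geometry methods to extract multiple kinds of information, covering protein conservation, complexity, composition, and global information; introducing a multivariate approach to the acquisition of protein structure information and seeking effective statistical algorithms to obtain comprehensive structure information; selecting the core protein information with the random forest method, in light of the characteristics of multi-source information; and studying a classification strategy based on fuzzy neural networks to better fuse features at multiple levels and improve the accuracy of localization prediction. The research builds on existing test data and also has ample, stable independent data for validation. The results will aid research on protein properties and functions, provide a reference for protein interaction studies and new drug development, and offer new ideas for protein information analysis and the design of applied algorithms.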
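Keywords (translated from Chinese): protein subcellular localization; information acquisition; fuzzy neural network; information fusion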
Abstract (English): Protein subcellular localization prediction is important for studying protein function, protein interactions, and their regulatory mechanisms. This project addresses problems in protein information extraction, selection, and integration, and focuses on how to extract protein sequence and structure information, select key information, and find an effective prediction strategy. The main contents include: using multivariate statistics, the Lempel-Ziv algorithm, and computational geometry methods together, we extract multiple kinds of information, including protein conservation, complexity, composition, and global information; for protein structure information, we introduce a multivariate approach and design an effective statistical algorithm to obtain comprehensive protein structure information; with the characteristics of multi-source information in mind, we select the core protein information with the help of the random forest method; using a fuzzy neural network classification strategy, we integrate the multiple features to improve prediction accuracy. This research is based on existing test data and has abundant, stable, independent data for validation. The research results of this project not only contribute to the study of protein properties and functions, but al
Keywords (English): Protein subcellular localization; Information acquisition; Fuzzy neural network; Information fusion
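
The abstract names the "LemZiv" (Lempel-Ziv) algorithm as one tool for extracting complexity information from protein sequences. The record does not say which variant or sequence encoding the project uses, so the following is only a minimal sketch, assuming the classic Lempel-Ziv (1976) exhaustive-history parsing applied directly to the raw character string. The function name `lz76_complexity` and the sample protein fragment are hypothetical and are not taken from the project.

```python
def lz76_complexity(seq: str) -> int:
    """Count the components of the Lempel-Ziv (1976) exhaustive history.

    Assumption: the sequence is parsed as-is, one symbol per character
    (e.g. one-letter amino-acid codes). The project may use a different
    variant or a reduced alphabet; this is an illustration only.
    """
    n = len(seq)
    i = 0        # start index of the component currently being built
    count = 0    # number of components found so far
    while i < n:
        length = 1
        # Extend the candidate while it can still be copied from the text
        # seen so far (the prefix up to, but excluding, its last symbol).
        while i + length <= n and seq[i:i + length] in seq[:i + length - 1]:
            length += 1
        count += 1   # the candidate is new (or the sequence ended)
        i += length
    return count


# Textbook binary example, parsed as 0 | 001 | 10 | 100 | 1000 | 101
print(lz76_complexity("0001101001000101"))  # 6

# Hypothetical protein fragment in one-letter amino-acid codes
print(lz76_complexity("MKTAYIAKQRQISFVKSHFSRQ"))
```

A low count relative to sequence length indicates a repetitive sequence, while a count close to the length indicates few repeated substrings. In a feature-extraction setting such a count would typically be normalized by sequence length before being combined with other features, though the record does not specify how the project does this.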