Protein subcellular localization is an important factor in normal cellular processes and disease. While many protein localization resources treat localization as static, it is dynamic and heavily influenced by biological context. Biological pathways are graphs that represent a specific biological context and can be inferred from large-scale data. We develop graph algorithms that predict the localization of all interactions in a biological pathway, framing the problem as an edge-labeling task. We compare a variety of models, including graph neural networks, probabilistic graphical models, and discriminative classifiers, for predicting localization annotations from curated pathway databases. We also perform a case study in which we construct biological pathways and predict localizations for human fibroblasts undergoing viral infection. Pathway localization prediction is a promising approach for integrating publicly available localization data into the analysis of large-scale biological data.
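
To make the edge-labeling formulation concrete, the following is a minimal sketch and not one of the models compared in this work. It assumes a pathway given as a list of interactions (edges), a partial set of known localization labels, and a naive neighbor-vote rule that we introduce here purely for illustration. The protein names `P1` to `P7`, the labels, and the function `label_edges` are all hypothetical.

```python
from collections import Counter


def label_edges(edges, known):
    """Toy edge labeling: give each unlabeled edge the most common
    localization among labeled edges that share one of its nodes.

    This neighbor-vote rule is an illustrative assumption, not a method
    taken from the text.
    """
    # Collect the known localization labels incident to each node.
    labels_by_node = {}
    for (u, v), loc in known.items():
        labels_by_node.setdefault(u, []).append(loc)
        labels_by_node.setdefault(v, []).append(loc)

    predicted = {}
    for u, v in edges:
        if (u, v) in known:
            # Keep curated annotations as they are.
            predicted[(u, v)] = known[(u, v)]
            continue
        votes = Counter(labels_by_node.get(u, []) + labels_by_node.get(v, []))
        # Edges with no labeled neighbors stay unlabeled (None).
        predicted[(u, v)] = votes.most_common(1)[0][0] if votes else None
    return predicted


# Hypothetical pathway: nodes are proteins, edges are interactions.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P3", "P5"), ("P6", "P7")]
# Hypothetical partial annotations for some of the interactions.
known = {
    ("P1", "P2"): "plasma membrane",
    ("P3", "P4"): "cytosol",
    ("P3", "P5"): "cytosol",
}

predicted = label_edges(edges, known)
for edge, loc in predicted.items():
    print(edge, loc)
```

In this toy input, the interaction `("P2", "P3")` receives one vote for "plasma membrane" and two for "cytosol", while `("P6", "P7")` has no labeled neighbors and stays unlabeled. The point is only that the prediction target is a label per interaction rather than per protein.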