The human proteome contains a vast network of interacting kinases and substrates. Although some kinases have proven immensely useful as therapeutic targets, the majority remain understudied. In this work, we present a novel knowledge graph representation learning approach to predict new interaction partners for understudied kinases. Our approach uses a phosphoproteomic knowledge graph constructed by integrating data from iPTMnet, Protein Ontology, Gene Ontology, and BioKG. The representations of kinases and substrates in this knowledge graph are learned by performing directed random walks on triples, coupled with a modified SkipGram or CBOW model. These representations are then used as input to a supervised classification model that predicts novel interactions for understudied kinases. We also present a post-predictive analysis of the predicted interactions and an ablation study of the phosphoproteomic knowledge graph to gain insight into the biology of the understudied kinases.
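The directed random walk step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the entity and relation names in `TOY_TRIPLES` are hypothetical, the function names and parameters (`walks_per_node`, `max_hops`) are assumptions, and emitting relation labels as tokens inside each walk is one possible design choice rather than something the abstract specifies. Each walk follows edges only from head to tail, which is what makes it directed.

```python
import random
from collections import defaultdict


def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((relation, tail))
    return adjacency


def directed_walk(adjacency, start, max_hops, rng):
    """Follow outgoing edges from `start`, emitting entity and relation tokens."""
    walk = [start]
    node = start
    for _ in range(max_hops):
        edges = adjacency.get(node)
        if not edges:
            break  # dead end: this entity never appears as a head
        relation, node = rng.choice(edges)
        walk.extend([relation, node])
    return walk


def generate_walks(triples, walks_per_node=10, max_hops=4, seed=0):
    """Generate several walks starting from every entity that has outgoing edges."""
    rng = random.Random(seed)
    adjacency = build_adjacency(triples)
    heads = sorted(adjacency)
    return [
        directed_walk(adjacency, head, max_hops, rng)
        for head in heads
        for _ in range(walks_per_node)
    ]


# Hypothetical toy triples, for illustration only (not taken from the real graph).
TOY_TRIPLES = [
    ("kinase_A", "phosphorylates", "substrate_B"),
    ("substrate_B", "annotated_with", "go_term_X"),
    ("kinase_A", "is_a", "protein_class_Y"),
]

walks = generate_walks(TOY_TRIPLES, walks_per_node=2, max_hops=3)
```

Each walk is a token sequence that can be treated like a sentence, so a standard SkipGram or CBOW trainer could consume the list of walks to produce entity embeddings. The modifications the abstract mentions for those models are not described there and are not reproduced in this sketch.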