Spatial context is central to understanding health and disease. Yet reference protein interaction networks lack such contextualization, thereby limiting the study of where protein interactions likely occur in the human body. Contextualized protein interactions could better characterize genes with disease-specific interactions and elucidate how diseases manifest in specific cell types. Here, we introduce AWARE, a graph neural message passing approach to inject cellular and tissue context into protein embeddings. AWARE optimizes for a multi-scale embedding space whose structure reflects the topology of cell type-specific networks. We construct a multi-scale network of the Human Cell Atlas and apply AWARE to learn protein, cell type, and tissue embeddings that uphold cell type and tissue hierarchies. We demonstrate AWARE on the novel task of predicting whether a gene is associated with a disease and where it most likely manifests in the human body. AWARE embeddings outperform global embeddings by at least 12.5%, highlighting the importance of contextual learners for protein networks.