Electron cryo-microscopy (cryo-EM) produces three-dimensional (3D) maps of the electrostatic potential of biological macromolecules, including proteins. Together with knowledge about the imaged molecules, cryo-EM maps allow de novo atomic modelling, which is typically done through a laborious manual process. Taking inspiration from recent advances in machine learning applications to protein structure prediction, we propose a graph neural network (GNN) approach for automated model building of proteins in cryo-EM maps. The GNN acts on a graph with nodes assigned to individual amino acids and edges representing the protein chain. Combining information from the voxel-based cryo-EM data, the amino acid sequence data and prior knowledge about protein geometries, the GNN refines the geometry of the protein chain and classifies the amino acid at each of its nodes. Application to 28 test cases shows that our approach outperforms the state of the art and approximates manual building for cryo-EM maps with resolutions better than 3.5 Å.