Hierarchical, rotation-equivariant neural networks to select structural models of protein complexes

From arXiv. 11 pages, 5 figures + SI. Updated based on the published version in PROTEINS. Presented at the NeurIPS 2019 workshop Learning Meaningful Representations of Life.

Predicting the structure of multi-protein complexes is a grand challenge in biochemistry, with major implications for basic science and drug discovery. Computational structure prediction methods generally leverage pre-defined structural features to distinguish accurate structural models from less accurate ones. This raises the question of whether it is possible to learn characteristics of accurate models directly from atomic coordinates of protein complexes, with no prior assumptions. Here we introduce a machine learning method that learns directly from the 3D positions of all atoms to identify accurate models of protein complexes, without using any pre-computed physics-inspired or statistical terms. Our neural network architecture combines multiple ingredients that together enable end-to-end learning from molecular structures containing tens of thousands of atoms: a point-based representation of atoms, equivariance with respect to rotation and translation, local convolutions, and hierarchical subsampling operations. When used in combination with previously developed scoring functions, our network substantially improves the identification of accurate structural models among a large set of possible models. Our network can also be used to predict the accuracy of a given structural model in absolute terms. The architecture we present is readily applicable to other tasks involving learning on 3D structures of large atomic systems.
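The symmetry requirement mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' network. It only shows, under simplifying assumptions, why features built from a point-based representation and local neighborhoods can be made independent of how a structure is rotated or translated. It covers the simpler invariant case (scalar features from interatomic distances), not full equivariance. The function names, the cutoff radius, and the toy coordinates are all hypothetical.

```python
import math

def local_features(points, radius=2.0):
    """One scalar per point: sum of exp(-d) over neighbors within `radius`."""
    features = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            d = math.dist(p, q)  # Euclidean distance, unchanged by rigid motions
            if i != j and d <= radius:
                total += math.exp(-d)
        features.append(total)
    return features

def rotate_z_then_translate(points, angle, shift):
    """Rigid motion: rotate about the z-axis by `angle` radians, then add `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    dx, dy, dz = shift
    return [(c * x - s * y + dx, s * x + c * y + dy, z + dz) for x, y, z in points]

# Hypothetical toy "atoms"; not taken from any real structure.
atoms = [
    (0.0, 0.0, 0.0),
    (1.5, 0.0, 0.0),
    (0.0, 1.5, 0.0),
    (4.0, 4.0, 4.0),
    (4.0, 5.2, 4.0),
]
moved = rotate_z_then_translate(atoms, angle=0.7, shift=(3.0, -2.0, 1.0))

original_features = local_features(atoms)
moved_features = local_features(moved)

# The features should agree up to floating-point rounding.
max_difference = max(abs(a - b) for a, b in zip(original_features, moved_features))
print(f"largest feature change after the rigid motion: {max_difference:.2e}")
```

Because each feature depends only on distances to nearby points, the same structure placed in a different pose yields the same per-atom values. A learned model with this property does not need to see every orientation of a complex during training.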

