Molecular mechanics (MM) potentials have long been a workhorse of computational chemistry. Leveraging accuracy and speed, these functional forms find use in a wide variety of applications in biomolecular modeling and drug discovery, from rapid virtual screening to detailed free energy calculations. Traditionally, MM potentials have relied on human-curated, inflexible, and poorly extensible discrete chemical perception rules for applying parameters to small molecules or biopolymers, making it difficult to optimize both types and parameters to fit quantum chemical or physical property data. Here, we propose an alternative approach that uses graph neural networks to perceive chemical environments, producing continuous atom embeddings from which valence and nonbonded parameters can be predicted using invariance-preserving layers. Since all stages are built from smooth neural functions, the entire process is modular and end-to-end differentiable with respect to model parameters, allowing new force fields to be easily constructed, extended, and applied to arbitrary molecules. We show that this approach is not only sufficiently expressive to reproduce legacy atom types, but can also learn to accurately reproduce and extend existing molecular mechanics force fields. Trained with arbitrary loss functions, it can construct entirely new force fields, self-consistently applicable to both biopolymers and small molecules, directly from quantum chemical calculations, with higher fidelity than traditional atom or parameter typing schemes. When trained on the same quantum chemical small molecule dataset used to parameterize the openff-1.2.0 small molecule force field, augmented with a peptide dataset, the resulting espaloma model shows superior accuracy vis-à-vis experiment in relative alchemical free energy calculations for a popular benchmark set.
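The pipeline described above (graph perception, continuous atom embeddings, then an invariance-preserving readout of parameters) can be illustrated with a minimal sketch. This is not the espaloma implementation: the random weights, the embedding width `DIM`, the toy water-like graph, the simple sum pooling, and the helper names `atom_embeddings` and `bond_parameters` are all assumptions made for illustration, and a real model would use an automatic differentiation framework so that the parameters can be trained end to end.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

DIM = 4  # embedding width (arbitrary choice for illustration)


def rand_matrix(rows, cols):
    """Random weights standing in for trained model parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


# Toy molecular graph: three atoms (O, H, H) joined by two O-H bonds.
elements = ["O", "H", "H"]
bonds = [(0, 1), (0, 2)]

# Hypothetical initial features per element and hypothetical layer weights.
element_features = {"O": rand_matrix(1, DIM)[0], "H": rand_matrix(1, DIM)[0]}
W_self, W_neigh = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)
W_bond = rand_matrix(2, DIM)  # maps a pooled bond embedding to (log k, log r0)


def atom_embeddings(elements, bonds, rounds=2):
    """Stage 1: message passing yields a continuous embedding per atom."""
    h = [list(element_features[e]) for e in elements]
    neighbors = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(rounds):
        new_h = []
        for i in range(len(h)):
            msg = [0.0] * DIM
            for j in neighbors[i]:
                msg = add(msg, h[j])  # sum over neighbors: order-independent
            z = add(matvec(W_self, h[i]), matvec(W_neigh, msg))
            new_h.append([math.tanh(x) for x in z])
        h = new_h
    return h


def bond_parameters(h, i, j):
    """Stage 2: symmetric pooling, so bond (i, j) and bond (j, i) agree."""
    pooled = add(h[i], h[j])
    log_k, log_r0 = matvec(W_bond, pooled)
    # The exponential keeps the force constant and bond length positive.
    return math.exp(log_k), math.exp(log_r0)


h = atom_embeddings(elements, bonds)
k_fwd, r0_fwd = bond_parameters(h, 0, 1)
k_rev, r0_rev = bond_parameters(h, 1, 0)
```

Because the readout pools the two atom embeddings with a sum, the predicted parameters do not depend on the order in which the atoms of a bond are listed, and the two chemically equivalent O-H bonds receive identical parameters without any discrete atom type being assigned.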