Molecular mechanics (MM) force fields -- the models that characterize the energy landscape of molecular systems via simple pairwise and polynomial terms -- have traditionally relied on discrete chemical parameter assignment rules (atom or valence types) that are curated by human experts, inflexible, and poorly extensible. Recently, there has been significant interest in using graph neural networks to replace this process, so that the parametrization scheme can be learned in an end-to-end differentiable manner directly from quantum chemical calculations or condensed-phase data. In this paper, we extend the Espaloma end-to-end differentiable force field construction approach by incorporating both energy and force fitting to quantum chemical data directly into the training process. Building on the OpenMM SPICE dataset, we curate a dataset covering chemical space of broad interest to biomolecular modeling, including small molecules, proteins, and RNA. The resulting force field, espaloma 0.3.0, self-consistently parametrizes these diverse biomolecular species, accurately predicts quantum chemical energies and forces, and maintains stable quantum chemical energy-minimized geometries. Surprisingly, this simple approach produces highly accurate protein-ligand binding free energies when the protein and ligand are parametrized self-consistently. This approach -- capable of fitting new force fields to large quantum chemical datasets in one GPU-day -- shows significant promise as a path forward for building systematically more accurate force fields that can be easily extended to new chemical domains of interest.