Molecular dynamics simulations are an invaluable tool in numerous scientific fields. However, the ubiquitous classical force fields cannot describe reactive systems, and quantum molecular dynamics is too computationally demanding to treat large systems or long timescales. Reactive force fields based on physics or machine learning can bridge the gap in time and length scales, but they require substantial effort to construct and are highly specific to a given chemical composition and application. A significant limitation of machine learning models is their use of element-specific features, which leads to models that scale poorly with the number of elements. This work introduces the Gaussian multipole (GMP) featurization scheme, which uses physically relevant multipole expansions of the electron density around atoms to yield feature vectors that interpolate between element types and have a fixed dimension regardless of the number of elements present. We combine GMP with neural networks and compare it directly to the widely used Behler-Parrinello symmetry functions on the MD17 dataset, finding that GMP exhibits improved accuracy and computational efficiency. Further, we demonstrate that GMP-based models can achieve chemical accuracy on the QM9 dataset, and that their accuracy remains reasonable even when extrapolating to new elements. Finally, we test GMP-based models on the Open Catalyst Project (OCP) dataset, finding performance comparable to that of graph convolutional deep learning models. The results indicate that this featurization scheme fills a critical gap in the construction of efficient and transferable machine-learned force fields.