Kernel ridge regression (KRR) that satisfies energy conservation is a popular approach for predicting force fields and molecular potentials, used to overcome the computational bottleneck of molecular dynamics simulation. However, the computational complexity of KRR grows cubically with the product of the number of atoms and the number of simulated configurations in the training sample, because a large covariance matrix must be inverted, which limits its application to the simulation of small molecules. Here, we introduce the atomized force field (AFF) model, which requires much lower computational cost to achieve quantum-chemical accuracy in predicting atomic forces and potential energies. Through a data-driven partition of the covariance kernel matrix of the force field and an induced input estimation approach for potential energies, we dramatically reduce the computational complexity of the machine learning algorithm while maintaining high prediction accuracy. This efficiency extends the reach of the method to larger molecules under the same computational budget. Using the MD17 dataset and another simulated dataset on larger molecules, we demonstrate that the accuracy of the AFF emulator ranges from 0.01 to 0.1 kcal mol$^{-1}$ for energies and from 0.001 to 0.2 kcal mol$^{-1}$ Å$^{-1}$ for atomic forces. Most importantly, this accuracy was achieved with less than 5 minutes of computational time for training the AFF emulator and making predictions on held-out molecular configurations. Furthermore, our approach provides uncertainty assessment for the predicted atomic forces and potentials, which is useful for developing a sequential design over the chemical input space, with almost no increase in computational cost.
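To make the cubic bottleneck concrete, the sketch below fits a generic KRR model with NumPy and compares a rough operation count for one dense factorization against the count for several smaller ones. It is only an illustration under stated assumptions: the Gaussian kernel, the function names (`rbf_kernel`, `krr_fit`, `krr_predict`, `cubic_cost`), and the equal-sized blocks are hypothetical choices made here, and the code does not reproduce the data-driven partition or the induced input estimation of the AFF model.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Gaussian kernel between the rows of A and B (an assumed kernel choice)."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def krr_fit(X, y, lam=1e-3, length_scale=1.0):
    """Solve (K + lam * I) alpha = y for the KRR weights.

    The Cholesky factorization of the m x m matrix costs O(m^3); in
    force-field KRR, m grows with (number of atoms) x (number of
    training configurations).
    """
    K = rbf_kernel(X, X, length_scale)
    L = np.linalg.cholesky(K + lam * np.eye(len(X)))
    return np.linalg.solve(L.T, np.linalg.solve(L, y))

def krr_predict(X_train, alpha, X_new, length_scale=1.0):
    """Predictive mean at the rows of X_new."""
    return rbf_kernel(X_new, X_train, length_scale) @ alpha

def cubic_cost(m, n_blocks=1):
    """Rough operation count for factorizing n_blocks matrices of size m / n_blocks.

    Illustrative only: equal-sized blocks are an assumption, not the
    partition used by the AFF model.
    """
    return n_blocks * (m / n_blocks) ** 3
```

Under this simple count, splitting an m x m system into b equal blocks lowers the factorization cost by a factor of b squared, which is the general reason a partition of the covariance matrix can make larger molecules tractable within the same budget.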