Neural network (NN) potentials promise highly accurate molecular dynamics (MD) simulations within the computational complexity of classical MD force fields. However, when applied outside their training domain, NN potential predictions can be inaccurate, increasing the need for uncertainty quantification (UQ). Bayesian modeling provides the mathematical framework for UQ, but classical Bayesian methods based on Markov chain Monte Carlo (MCMC) are computationally intractable for NN potentials. By training graph NN potentials for coarse-grained systems of liquid water and alanine dipeptide, we demonstrate here that scalable Bayesian UQ via stochastic gradient MCMC (SG-MCMC) yields reliable uncertainty estimates for MD observables. We show that cold posteriors can reduce the required training data size and that multiple Markov chains are needed for reliable UQ. Additionally, we find that SG-MCMC and the Deep Ensemble method achieve comparable results, despite the shorter training and less hyperparameter tuning of the latter. We show that both methods can capture aleatoric and epistemic uncertainty reliably, but not systematic uncertainty, which needs to be minimized by adequate modeling to obtain accurate credible intervals for MD observables. Our results represent a step towards accurate UQ, which is of vital importance for trustworthy NN potential-based MD simulations required for decision-making in practice.
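To make the ingredients named above concrete, the following is a minimal sketch of SG-MCMC in its simplest form, stochastic gradient Langevin dynamics (SGLD), with a temperature parameter for cold posteriors and several independent chains. It is not the graph NN setup of this work: the model is a toy one-parameter Gaussian mean, and every name and setting in it (`sgld_chain`, `run_chains`, the step size, batch size, prior width, and temperatures) is an assumption chosen for illustration only.

```python
import numpy as np

def sgld_chain(data, n_steps, step_size, temperature, batch_size, seed):
    """One SGLD chain for the mean of unit-variance Gaussian data (toy model)."""
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = rng.normal()  # random initialization, different per chain
    samples = []
    for step in range(n_steps):
        batch = rng.choice(data, size=batch_size, replace=False)
        # Stochastic gradient of the log posterior:
        # prior N(0, 10^2) plus minibatch likelihood rescaled to the full data set.
        grad = -theta / 100.0 + (n / batch_size) * np.sum(batch - theta)
        # Langevin update; temperature < 1 shrinks the injected noise (cold posterior).
        noise = np.sqrt(step_size * temperature) * rng.normal()
        theta = theta + 0.5 * step_size * grad + noise
        if step >= n_steps // 2:  # discard the first half as burn-in
            samples.append(theta)
    return np.array(samples)

def run_chains(data, temperature, n_chains=4):
    """Run independent chains and return the per-chain samples as a list."""
    return [
        sgld_chain(data, n_steps=4000, step_size=1e-4, temperature=temperature,
                   batch_size=20, seed=100 + i)
        for i in range(n_chains)
    ]

# Synthetic data standing in for training data (assumed values).
data = np.random.default_rng(42).normal(loc=1.5, scale=1.0, size=200)

hot_chains = run_chains(data, temperature=1.0)    # standard posterior
cold_chains = run_chains(data, temperature=0.25)  # cold posterior

hot = np.concatenate(hot_chains)
cold = np.concatenate(cold_chains)

# 95% credible interval from the pooled samples of all chains.
lo, hi = np.percentile(hot, [2.5, 97.5])
# Spread of the per-chain means: a simple check of agreement between chains.
chain_mean_spread = np.std([c.mean() for c in hot_chains])

print("posterior mean (T=1):", hot.mean())
print("95% credible interval (T=1):", (lo, hi))
print("posterior std, T=1 vs T=0.25:", hot.std(), cold.std())
print("std of per-chain means:", chain_mean_spread)
```

In this toy setting the cold posterior is simply narrower than the standard one, and pooling several chains gives both more samples and a way to see whether the chains agree. For an NN potential the same sampled parameters would be propagated through MD simulations to obtain credible intervals for observables, a step this sketch does not attempt.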