A high level of physical detail in a molecular model improves its ability to perform high-accuracy simulations, but can also significantly increase its complexity and computational cost. In some situations, it is worthwhile to add complexity to a model to capture properties of interest; in others, the extra complexity is unnecessary and can make simulations computationally infeasible. In this work we demonstrate the use of Bayes factors for molecular model selection, using Monte Carlo sampling techniques to evaluate the evidence for different levels of complexity in the two-centered Lennard-Jones + quadrupole (2CLJQ) fluid model. Examining three levels of nested model complexity, we demonstrate that the use of variable quadrupole and bond length parameters in this model framework is justified only sometimes. We also explore the effect of the Bayesian prior distribution on the Bayes factors, as well as ways to propose meaningful prior distributions. This Bayesian Markov chain Monte Carlo (MCMC) process is enabled by the use of analytical surrogate models that accurately approximate the physical properties of interest. This work paves the way for further atomistic model selection work via Bayesian inference and surrogate modeling.
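To make the idea of a Bayes factor between nested models concrete, the following is a minimal sketch on a toy problem, not the 2CLJQ model or the sampling scheme used in this work. It assumes Gaussian data with known unit variance and compares a simple model (mean fixed at zero) with a more complex one (mean free, with a zero-centered Gaussian prior). The evidence of the complex model is estimated by plain Monte Carlo averaging of the likelihood over prior samples, which is only one of several possible estimators. All function names and parameters here (`log_likelihood`, `log_evidence_mc`, `log_bayes_factor`, `prior_width`) are hypothetical and chosen for illustration.

```python
import numpy as np


def log_likelihood(data, mu, sigma=1.0):
    """Gaussian log-likelihood of `data` for each candidate mean in `mu`."""
    mu = np.atleast_1d(mu)
    resid = data[None, :] - mu[:, None]  # shape (n_mu, n_data)
    return (-0.5 * np.sum((resid / sigma) ** 2, axis=1)
            - data.size * np.log(sigma * np.sqrt(2.0 * np.pi)))


def log_evidence_mc(data, prior_samples):
    """Estimate log p(D | M) as the log of the prior-averaged likelihood."""
    ll = log_likelihood(data, prior_samples)
    shift = ll.max()  # subtract the maximum for numerical stability
    return shift + np.log(np.mean(np.exp(ll - shift)))


def log_bayes_factor(data, prior_width, n_samples=200_000, seed=0):
    """Log Bayes factor of the complex model over the simple model.

    Simple model:  mean fixed at 0 (no free parameters).
    Complex model: mean ~ Normal(0, prior_width) (assumed prior).
    """
    rng = np.random.default_rng(seed)
    # With no free parameters, the evidence is just the likelihood at mu = 0.
    log_z_simple = log_likelihood(data, 0.0)[0]
    samples = rng.normal(0.0, prior_width, size=n_samples)
    log_z_complex = log_evidence_mc(data, samples)
    return log_z_complex - log_z_simple


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(0.5, 1.0, size=20)
    for width in (1.0, 10.0):
        print(f"prior width {width}: log BF = {log_bayes_factor(data, width):.3f}")
```

In this toy setting, widening the prior on the extra parameter lowers the Bayes factor for the complex model even though the data are unchanged, which illustrates why the choice of prior matters for this kind of model selection.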