We present two machine learning methodologies that can predict diffusion Monte Carlo (DMC) energies from small datasets ($\approx$60 DMC calculations in total). The first uses voxel deep neural networks (VDNNs) to predict DMC energy densities, taking Kohn-Sham density functional theory (DFT) electron densities as input. The second uses kernel ridge regression (KRR) to predict atomic contributions to the DMC total energy, taking atomic environment vectors as input (we used atom-centred symmetry functions, atomic environment vectors from the ANI models, and smooth overlap of atomic positions). We first compare the methodologies on pristine graphene lattices, where KRR performs best in comparison with gradient-boosted decision trees, random forests, Gaussian process regression, and multilayer perceptrons. In addition, KRR outperforms VDNNs by an order of magnitude. We then study how well KRR generalizes when predicting the energy barrier associated with a Stone-Wales defect. Lastly, we move from 2D to 3D materials and use KRR to predict total energies of liquid water. In all cases, the KRR models are more accurate than Kohn-Sham DFT, and all mean absolute errors are below chemical accuracy.
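To illustrate how KRR can yield atomic contributions when only total energies are available as labels, the following is a minimal sketch, not the implementation used in this work. It assumes a Gaussian kernel summed over all atom pairs of two structures, so that the fitted total energy decomposes into per-atom terms. All function names, the hyperparameters `sigma` and `lam`, the random descriptors, and the toy energy function are hypothetical stand-ins for real atomic environment vectors and DMC energies.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma):
    """Gaussian kernel between rows of X (n, d) and rows of Y (m, d)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def structure_kernel(A, B, sigma):
    """K[a, b] = sum over atom pairs (i in a, j in b) of the atomic kernel."""
    K = np.empty((len(A), len(B)))
    for a, Xa in enumerate(A):
        for b, Xb in enumerate(B):
            K[a, b] = gaussian_kernel(Xa, Xb, sigma).sum()
    return K

def fit(structures, energies, sigma=1.0, lam=1e-3):
    """Solve (K + lam * I) alpha = E for the regression weights alpha."""
    K = structure_kernel(structures, structures, sigma)
    return np.linalg.solve(K + lam * np.eye(len(structures)), energies)

def atomic_energies(X, train_structures, alpha, sigma=1.0):
    """Per-atom contributions for one structure with descriptors X (n_atoms, d)."""
    # Column b holds, for each atom in X, its kernel summed over the atoms
    # of training structure b.
    cols = [gaussian_kernel(X, Xb, sigma).sum(axis=1) for Xb in train_structures]
    return np.stack(cols, axis=1) @ alpha

# Toy data (assumption): 60 structures, 8 atoms each, 3-component descriptors.
rng = np.random.default_rng(0)
train = [rng.normal(size=(8, 3)) for _ in range(60)]

def toy_atomic_energy(X):
    """Hypothetical per-atom energy used only to generate total-energy labels."""
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

E_train = np.array([toy_atomic_energy(X).sum() for X in train])
alpha = fit(train, E_train)

# Predict a new structure: per-atom terms first, then their sum as total energy.
test_structure = rng.normal(size=(8, 3))
e_atoms = atomic_energies(test_structure, train, alpha)
E_pred = e_atoms.sum()
```

Because the structure kernel is a sum of atomic kernels, the predicted total energy is by construction the sum of the per-atom terms returned by `atomic_energies`. In practice `sigma` and `lam` would be chosen by cross-validation on the available calculations.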