Appropriately identifying and treating molecules and materials with significant multi-reference (MR) character is crucial for achieving high data fidelity in virtual high-throughput screening (VHTS). Nevertheless, most VHTS is carried out with approximate density functional theory (DFT) using a single functional. Despite the development of numerous MR diagnostics, the extent to which a single value of such a diagnostic indicates the MR effect on chemical property prediction is not well established. We evaluate MR diagnostics of over 10,000 transition metal complexes (TMCs) and compare them to those of organic molecules. We reveal that only some MR diagnostics are transferable across these materials spaces. By studying the influence of MR character on chemical properties (i.e., the MR effect) that involve multiple potential energy surfaces (i.e., adiabatic spin splitting, $\Delta E_\mathrm{H-L}$, and ionization potential, IP), we observe that cancellation in MR effect outweighs accumulation. Differences in MR character are more important than the total degree of MR character in predicting the MR effect on a computed property. Motivated by this observation, we build transfer learning models to directly predict CCSD(T)-level adiabatic $\Delta E_\mathrm{H-L}$ and IP from lower levels of theory. By combining these models with uncertainty quantification and multi-level modeling, we introduce a multi-pronged strategy that accelerates data acquisition by at least a factor of three while achieving chemical accuracy (i.e., 1 kcal/mol) for robust VHTS.