Chromophores based on earth-abundant transition metals are an important design target for applications in lighting and non-toxic bioimaging, but their design is challenged by the scarcity of complexes that simultaneously have target absorption energies in the visible region and well-defined ground states. Machine learning (ML) accelerated discovery could overcome such challenges by enabling screening of a larger space, but it is limited by the fidelity of the data used in ML model training, which typically comes from a single approximate density functional. To address this limitation, we search for consensus in predictions among 23 density functional approximations across multiple rungs of Jacob's ladder. To accelerate the discovery of complexes with absorption energies in the visible region while minimizing multireference (MR) character, we use 2D efficient global optimization to sample candidate low-spin chromophores from multi-million complex spaces. Despite the scarcity (i.e., approximately 0.01%) of potential chromophores in this large chemical space, we identify candidates with a high likelihood (i.e., > 10%) of computational validation as the ML models improve during active learning, representing a 1,000-fold acceleration in discovery. Absorption spectra of promising chromophores from time-dependent density functional theory verify that two-thirds of candidates have the desired excited-state properties. The observation that constituent ligands from our leads have demonstrated interesting optical properties in the literature exemplifies the effectiveness of our construction of a realistic design space and of our active learning approach.
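As a rough illustration of two quantities mentioned above, the sketch below computes (i) the fraction of density functional approximations that agree a complex absorbs in the visible region and (ii) the enrichment implied by the quoted hit rates. It is a minimal sketch under stated assumptions, not the workflow used in this work: the function names, the example energies, and the visible window of roughly 1.65 to 3.26 eV (about 750 to 380 nm) are all hypothetical choices made for illustration.

```python
# Hypothetical visible-light window in eV (about 750 nm to 380 nm).
# This window is an assumption for illustration, not a value from the text.
VISIBLE_WINDOW_EV = (1.65, 3.26)


def consensus_fraction(energies_ev, window=VISIBLE_WINDOW_EV):
    """Fraction of functionals whose predicted absorption energy
    falls inside the target window (inclusive bounds)."""
    if not energies_ev:
        raise ValueError("need at least one prediction")
    low, high = window
    hits = sum(1 for energy in energies_ev if low <= energy <= high)
    return hits / len(energies_ev)


def enrichment(guided_hit_rate, baseline_hit_rate):
    """Ratio of the guided hit rate to the baseline hit rate."""
    if baseline_hit_rate <= 0:
        raise ValueError("baseline hit rate must be positive")
    return guided_hit_rate / baseline_hit_rate


if __name__ == "__main__":
    # Made-up predictions from four hypothetical functionals (eV).
    predictions = [2.0, 2.5, 4.0, 1.0]
    print("consensus fraction:", consensus_fraction(predictions))

    # Hit rates quoted in the text: > 10% guided versus approx. 0.01% baseline.
    print("enrichment:", round(enrichment(0.10, 0.0001)))
```

With the quoted hit rates, the ratio 10% / 0.01% gives the 1,000-fold figure; since the guided rate is stated as a lower bound, this is the minimum enrichment those numbers imply.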