Gryffin: An algorithm for Bayesian optimization of categorical variables informed by expert knowledge

Designing functional molecules and advanced materials requires complex design choices: tuning continuous process parameters such as temperatures or flow rates, while simultaneously selecting categorical options such as catalysts or solvents. To date, the development of data-driven experiment planning strategies for autonomous experimentation has largely focused on continuous process parameters, despite the pressing need for efficient strategies for selecting categorical variables. Here, we introduce Gryffin, a general-purpose optimization framework for the autonomous selection of categorical variables driven by expert knowledge. Gryffin augments Bayesian optimization based on kernel density estimation with smooth approximations to categorical distributions. By leveraging domain knowledge in the form of physicochemical descriptors, Gryffin can significantly accelerate the search for promising molecules and materials. Gryffin can further highlight relevant correlations between the provided descriptors to inspire physical insights and foster scientific intuition. In addition to comprehensive benchmarks, we demonstrate the capabilities and performance of Gryffin on three examples in materials science and chemistry: (i) the discovery of non-fullerene acceptors for organic solar cells, (ii) the design of hybrid organic-inorganic perovskites for light harvesting, and (iii) the identification of ligands and process parameters for Suzuki-Miyaura reactions. Our results suggest that Gryffin, in its simplest form, is competitive with state-of-the-art categorical optimization algorithms. When it leverages domain knowledge provided via descriptors, however, Gryffin outperforms other approaches while simultaneously refining this domain knowledge to promote scientific understanding.
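
To make the phrase "smooth approximations to categorical distributions" concrete, the following is a minimal sketch of one such approximation. The abstract does not say which relaxation Gryffin uses, so the Gumbel-softmax (Concrete) relaxation shown here is an assumption, chosen because it is a common way to relax a categorical choice onto the probability simplex. The function name `relaxed_categorical_sample`, the option labels, and the probabilities are hypothetical and serve only as illustration; this is not Gryffin's API.

```python
import numpy as np


def relaxed_categorical_sample(logits, temperature, rng):
    """Draw one Gumbel-softmax sample: a point on the probability simplex.

    Assumption: this relaxation is an illustrative choice, not necessarily
    the one Gryffin implements.
    """
    logits = np.asarray(logits, dtype=float)
    # Gumbel(0, 1) noise via inverse transform sampling; the lower bound
    # keeps log(u) finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    # Low temperatures push the sample toward a one-hot vector,
    # high temperatures flatten it toward the uniform distribution.
    z = (logits + gumbel) / temperature
    z = z - z.max()  # shift for numerical stability before exponentiating
    weights = np.exp(z)
    return weights / weights.sum()


# Hypothetical categorical variable with three options.
options = ["ligand_A", "ligand_B", "ligand_C"]
logits = np.log([0.2, 0.5, 0.3])

rng = np.random.default_rng(0)
smooth = relaxed_categorical_sample(logits, temperature=5.0, rng=rng)
sharp = relaxed_categorical_sample(logits, temperature=0.05, rng=rng)

print(dict(zip(options, np.round(smooth, 3))))
print(dict(zip(options, np.round(sharp, 3))))
```

Each sample is a vector of non-negative weights summing to one, so a discrete choice becomes a continuous point that a density-based surrogate can work with. As the temperature decreases, samples concentrate on a single option and the relaxation approaches the original categorical distribution.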
