Mitigating the climate crisis requires a rapid transition towards lower-carbon energy. Catalyst materials play a crucial role in the electrochemical reactions involved in many industrial processes key to this transition, such as renewable energy storage and electrofuel synthesis. To reduce the energy spent on such processes, we must quickly discover more efficient catalysts to drive these electrochemical reactions. Machine learning (ML) holds the potential to efficiently model the properties of materials from large amounts of data, and thus to accelerate electrocatalyst design. The Open Catalyst Project's OC20 dataset was constructed to that end. However, most existing ML models trained on OC20 are still neither scalable nor accurate enough for practical applications. Here, we propose several task-specific innovations, applicable to most architectures, which increase both computational efficiency and accuracy. In particular, we propose improvements in (1) the graph creation step, (2) atom representations, and (3) the energy prediction head. We describe these contributions and evaluate them on several architectures, showing up to a 5$\times$ reduction in inference time without sacrificing accuracy.