Ionic liquids (ILs) offer a promising route to CO$_2$ capture and storage for mitigating global warming. However, identifying and designing high-capacity ILs from the vast chemical space requires expensive and exhaustive simulations and experiments. Machine learning (ML) can accelerate the search for desirable ionic molecules through accurate and efficient data-driven property prediction. Existing descriptors and ML models for ionic molecules, however, make inefficient use of molecular graph structure. Moreover, few works have investigated the explainability of these ML models, which would help in understanding the learned features and in guiding the design of efficient ionic molecules. In this work, we develop both fingerprint-based ML models and Graph Neural Networks (GNNs) to predict CO$_2$ absorption in ILs. Fingerprints use the graph structure only at the feature extraction stage, whereas GNNs operate directly on the molecular structure in both the feature extraction and model prediction stages. We show that our method outperforms previous ML models, reaching high accuracy (MAE of 0.0137, $R^2$ of 0.9884). Furthermore, we take advantage of the GNN feature representation and develop a substructure-based explanation method that provides insight into how each chemical fragment within an IL molecule contributes to the model's CO$_2$ absorption prediction. We also show that our explanation results agree with ground truth available from the theoretical reaction mechanism of CO$_2$ absorption in ILs, which can inform the design of novel and efficient functional ILs in the future.
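
For readers unfamiliar with the two reported metrics, the following minimal Python sketch shows how MAE and $R^2$ are conventionally computed from measured and predicted values. The function names and the toy numbers are illustrative assumptions, not the authors' code or data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical absorption values, for illustration only.
    measured = [0.10, 0.25, 0.40, 0.55]
    predicted = [0.12, 0.24, 0.41, 0.52]
    print("MAE:", mean_absolute_error(measured, predicted))
    print("R^2:", r2_score(measured, predicted))
```

A lower MAE and an $R^2$ closer to 1 both indicate predictions that track the measured values more closely.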