Advanced computational methods are being actively sought for addressing the challenges associated with discovery and development of new combinatorial material such as formulations. A widely adopted approach involves domain informed high-throughput screening of individual components that can be combined into a formulation. This manages to accelerate the discovery of new compounds for a target application but still leave the process of identifying the right 'formulation' from the shortlisted chemical space largely a laboratory experiment-driven process. We report a deep learning model, Formulation Graph Convolution Network (F-GCN), that can map structure-composition relationship of the individual components to the property of liquid formulation as whole. Multiple GCNs are assembled in parallel that featurize formulation constituents domain-intuitively on the fly. The resulting molecular descriptors are scaled based on respective constituent's molar percentage in the formulation, followed by formalizing into a combined descriptor that represents a complete formulation to an external learning architecture. The use case of proposed formulation learning model is demonstrated for battery electrolytes by training and testing it on two exemplary datasets representing electrolyte formulations vs battery performance -- one dataset is sourced from literature about Li/Cu half-cells, while the other is obtained by lab-experiments related to lithium-iodide full-cell chemistry. The model is shown to predict the performance metrics like Coulombic Efficiency (CE) and specific capacity of new electrolyte formulations with lowest reported errors. The best performing F-GCN model uses molecular descriptors derived from molecular graphs that are informed with HOMO-LUMO and electric moment properties of the molecules using a knowledge transfer technique.
翻译:暂无翻译