Deep learning approaches have recently been studied extensively for various problems in chemistry, such as property prediction, virtual screening, and de novo molecule design. Despite these impressive successes, end-to-end training usually requires a separately designed network for each specific task, so it is often difficult to find a unified principle for synergistically combining existing models and training datasets for novel tasks. To address this, we present a novel multimodal chemical foundation model that can be used for various downstream tasks requiring a simultaneous understanding of structure and property. Specifically, inspired by recent advances in pre-trained multimodal foundation models such as Vision-Language Pretrained (VLP) models, we propose a structure-property multimodal (SPMM) foundation model that uses a dual-stream transformer with X-shape attention to align molecular structure and chemical properties in a common embedding space. Thanks to the outstanding structure-property unimodal representation, experimental results confirm that SPMM can simultaneously perform molecule generation, property prediction, classification, and reaction prediction, among other tasks, which was previously not possible with a single architecture.