Chemical reactions are the fundamental building blocks of drug design and organic chemistry research. Machine learning for chemistry is a rapidly advancing field with numerous applications, and in recent years there has been a growing need for a large-scale deep-learning framework that can efficiently capture the basic rules of chemical reactions. In this paper, we propose a unified framework that addresses both reaction representation learning and molecule generation, allowing for a more holistic approach. Inspired by the mechanisms of organic chemistry, we develop a novel pretraining framework that enables us to incorporate inductive biases into the model. Our framework achieves state-of-the-art results on challenging downstream tasks. Because it encodes chemical knowledge, the framework can be applied to reaction-based generative models, overcoming the limitations of current molecule generation models that rely on a small number of reaction templates. In extensive experiments, our model generates high-quality, synthesizable, drug-like structures. Overall, our work presents a significant step toward a large-scale deep-learning framework for a variety of reaction-based applications.