Constrained synthesizability is an unaddressed challenge in generative molecular design: designing molecules that satisfy multi-parameter optimization objectives, are synthesizable, and whose syntheses are constrained to use specific commercial building blocks. This capability is practically important for molecule re-purposing, sustainability, and efficiency. In this work, we propose a novel reward function called TANimoto Group Overlap (TANGO), which uses chemistry principles to transform a sparse reward function into a dense, learnable one, a property crucial for reinforcement learning. TANGO can augment general-purpose molecular generative models to directly optimize for constrained synthesizability while simultaneously optimizing for other properties relevant to drug discovery using reinforcement learning. Our framework is general and addresses starting-material, intermediate, and divergent synthesis constraints. Contrary to most existing work in the field, we show that incentivizing a general-purpose model, without any inductive biases toward synthesizability, is a productive approach to navigating these challenging optimization scenarios. We demonstrate this by showing that the trained models explicitly learn a desirable distribution. Our framework is the first generative approach to tackle constrained synthesizability.
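To make the sparse-to-dense idea concrete, below is a minimal Python sketch of a TANGO-style reward using RDKit. It returns 1.0 when the enforced building block appears exactly among a route's starting materials, and otherwise blends fingerprint Tanimoto similarity with a functional-group overlap term so that reinforcement learning still receives a graded signal instead of a flat zero. The weight w, the SMARTS-defined group set, and the fingerprint settings are illustrative assumptions, not the paper's exact definition.

```python
# Illustrative TANGO-style dense reward (assumptions noted above): a binary
# "is the enforced building block a starting material?" check is sparse; this
# sketch instead grades how chemically close each starting material is to the
# enforced block, so the RL agent gets a signal even before an exact match.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

# A few SMARTS-defined functional groups (an illustrative set, not the paper's).
FG_SMARTS = {
    "carboxylic_acid": Chem.MolFromSmarts("C(=O)[OX2H1]"),
    "primary_amine": Chem.MolFromSmarts("[NX3;H2]"),
    "aryl_halide": Chem.MolFromSmarts("c[F,Cl,Br,I]"),
}

def tanimoto(mol_a, mol_b, radius=2, n_bits=2048):
    """Tanimoto similarity over Morgan fingerprints (settings are assumptions)."""
    fp_a = AllChem.GetMorganFingerprintAsBitVect(mol_a, radius, nBits=n_bits)
    fp_b = AllChem.GetMorganFingerprintAsBitVect(mol_b, radius, nBits=n_bits)
    return DataStructs.TanimotoSimilarity(fp_a, fp_b)

def fg_overlap(mol_a, mol_b):
    """Jaccard overlap of the functional groups present in each molecule."""
    groups_a = {n for n, p in FG_SMARTS.items() if mol_a.HasSubstructMatch(p)}
    groups_b = {n for n, p in FG_SMARTS.items() if mol_b.HasSubstructMatch(p)}
    union = groups_a | groups_b
    return len(groups_a & groups_b) / len(union) if union else 0.0

def tango_style_reward(start_material_smiles, enforced_block_smiles, w=0.5):
    """Dense reward in [0, 1]: 1.0 on an exact starting-material match,
    otherwise the best weighted blend of similarity and group overlap."""
    block = Chem.MolFromSmiles(enforced_block_smiles)
    block_canonical = Chem.MolToSmiles(block)
    best = 0.0
    for smiles in start_material_smiles:
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            continue
        if Chem.MolToSmiles(mol) == block_canonical:
            return 1.0  # constraint satisfied exactly
        score = w * tanimoto(mol, block) + (1 - w) * fg_overlap(mol, block)
        best = max(best, score)
    return best

# Example: a route starting from 4-bromoaniline scores highly against an
# enforced 4-chloroaniline block (near miss), rather than receiving zero.
print(tango_style_reward(["Nc1ccc(Br)cc1"], "Nc1ccc(Cl)cc1"))
```

The design choice this illustrates is the one the abstract names: an exact-match constraint gives almost-always-zero reward and is unlearnable, whereas grading near misses by chemical similarity yields a smooth surface the policy can climb toward routes that actually contain the enforced building block.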