Optimizing combinatorial structures is core to many real-world problems, such as those encountered in the life sciences. For example, a crucial step in antibody design is finding an arrangement of amino acids in a protein sequence that improves its binding with a pathogen. Combinatorial optimization of antibodies is difficult because the search spaces are extremely large and the objectives are non-linear. Even for modest antibody design problems, where proteins have a sequence length of eleven, we are faced with searching over roughly 2.05 × 10^14 structures. Applying traditional reinforcement learning algorithms such as Q-learning to combinatorial optimization results in poor performance. We propose Structured Q-learning (SQL), an extension of Q-learning that incorporates structural priors for combinatorial optimization. Using a molecular docking simulator, we demonstrate that SQL finds sequences with high binding energy and performs favourably against baselines on eight challenging antibody design tasks, including designing antibodies for SARS-CoV.
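The search-space figure quoted for length-eleven sequences can be checked with a short calculation. The sketch below assumes that each position may hold any of the 20 standard amino acids; the abstract does not state the alphabet size, so that value and the helper name `search_space_size` are illustrative assumptions.

```python
# Assumption (not stated in the abstract): every position in the sequence
# can independently take any of the 20 standard amino acids.
NUM_AMINO_ACIDS = 20
SEQUENCE_LENGTH = 11  # sequence length given in the abstract


def search_space_size(alphabet_size: int, length: int) -> int:
    """Count the distinct sequences of `length` symbols over the alphabet."""
    return alphabet_size ** length


size = search_space_size(NUM_AMINO_ACIDS, SEQUENCE_LENGTH)
print(f"{size:.3e}")  # 2.048e+14
```

Under this assumption the count is 20^11 = 2.048 × 10^14, which matches the approximately 2.05 × 10^14 structures quoted above.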