We introduce AbBiBench (Antibody Binding Benchmarking), a benchmarking framework for antibody binding affinity maturation and design. Existing antibody evaluation strategies consider the antibody alone and score its similarity to natural antibodies (e.g., amino acid identity rate, structural RMSD). AbBiBench instead treats the antibody-antigen (Ab-Ag) complex as a functional unit and estimates how likely an antibody design is to bind a given antigen by measuring a protein model's likelihood on the Ab-Ag complex. We first curate, standardize, and share 9 datasets covering 9 antigens (involving influenza, lysozyme, HER2, VEGF, integrin, and SARS-CoV-2) and 155,853 antibodies with heavy-chain mutations. Using these datasets, we systematically compare 14 protein models, including masked language models, autoregressive language models, inverse folding models, diffusion-based generative models, and geometric graph models. Model performance is evaluated by the correlation between model likelihood and experimentally measured affinity. Additionally, in a case study that aims to increase the binding affinity of antibody F045-092 to the influenza H1N1 antigen, we evaluate the generative power of the top-performing models by sampling a set of new antibodies against the antigen and ranking them by the structural integrity and biophysical properties of the resulting Ab-Ag complex. We find that structure-conditioned inverse folding models outperform the other model classes in both the affinity correlation and generation tasks. Overall, AbBiBench provides a unified, biologically grounded evaluation framework to facilitate the development of more effective, function-aware antibody design models.
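To make the evaluation criterion concrete, the following minimal sketch scores one model by correlating its per-variant likelihoods with measured affinities. It rests on assumptions not stated in the abstract: the correlation is taken to be Spearman's rank correlation, the affinity values are oriented so that larger means stronger binding, and the function names and toy numbers are purely illustrative, not part of AbBiBench.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(scores, affinities):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(scores) != len(affinities) or len(scores) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    return pearson(average_ranks(scores), average_ranks(affinities))


# Hypothetical toy data: one model log-likelihood and one measured
# affinity per antibody variant (higher affinity value = stronger binding).
log_likelihoods = [-12.4, -10.1, -15.3, -9.8, -11.0]
measured_affinity = [0.8, 1.5, 0.2, 1.1, 1.9]

rho = spearman(log_likelihoods, measured_affinity)
print(f"Spearman correlation: {rho:.2f}")
```

A rank-based correlation is a natural choice in this setting because model likelihoods and affinity measurements live on different scales, and only their relative ordering across variants needs to agree.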