Inverse folding models have proven to be highly effective zero-shot predictors of protein stability. Despite this success, the link between the amino acid preferences of an inverse folding model and the free-energy considerations underlying thermodynamic stability remains incompletely understood. A better understanding would not only be of theoretical interest but could also provide the basis for stronger zero-shot stability prediction. In this paper, we take steps to clarify the free-energy foundations of inverse folding models. Our derivation reveals the standard practice of using likelihood ratios to be a simplistic approximation and suggests several paths towards better estimates of the relative stability. We empirically assess these approaches and demonstrate that considerable gains in zero-shot performance can be achieved with fairly simple means.