Graph Neural Networks (GNNs) have revolutionized molecular discovery by learning patterns and identifying unknown features that aid in predicting biophysical properties and protein-ligand interactions. However, current models typically rely on 2-dimensional molecular representations as input, and while the use of 2- and 3-dimensional structural data has gained deserved traction in recent years, many of these models are still limited to static graph representations. We propose a novel transformer-based approach that uses GNNs to characterize dynamic features of protein-ligand interactions. Our message passing transformer pre-trains on molecular dynamics data from physics-based simulations to learn coordinate construction, and makes binding probability and affinity predictions as a downstream task. In extensive testing against existing models, our MDA-PLI model outperformed the molecular interaction prediction models with an RMSE of 1.2958. The geometric encodings enabled by our transformer architecture and the addition of time series data add a new dimension to this line of research.