SPLDExtraTrees: Robust machine learning approach for predicting kinase inhibitor resistance

Drug resistance is a major threat to global health and a significant concern throughout the clinical treatment of diseases and drug development. Mutations in proteins involved in drug binding are a common cause of adaptive drug resistance. Quantitative estimates of how mutations affect the interaction between a drug and its target protein are therefore of vital significance for drug development and clinical practice. Computational methods that rely on molecular dynamics simulations, Rosetta protocols, and machine learning have been shown to be capable of predicting ligand affinity changes upon protein mutation. However, severely limited sample sizes and heavy noise cause overfitting and poor generalization, which has impeded the wide adoption of machine learning for studying drug resistance. In this paper, we propose a robust machine learning method, termed SPLDExtraTrees, which can accurately predict ligand binding affinity changes upon protein mutation and identify resistance-causing mutations. Specifically, the proposed method ranks training data following a scheme that starts with easy-to-learn samples and gradually incorporates harder and more diverse samples into the training, iterating between sample weight recalculations and model updates. In addition, we calculate additional physics-based structural features to provide the machine learning model with valuable domain knowledge on proteins for these data-limited predictive tasks. Experiments substantiate the capability of the proposed method for predicting kinase inhibitor resistance under three scenarios, where it achieves predictive accuracy comparable to that of molecular dynamics and Rosetta methods at much lower computational cost.
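The abstract does not spell out how samples are ranked and reweighted. As a rough illustration only, the sketch below assumes the selection rule commonly used in the self-paced learning with diversity (SPLD) literature: within each group of similar samples, the sample at loss rank i is kept if its loss is below `lam + gamma / (sqrt(i) + sqrt(i - 1))`. The function name `spld_select`, the parameters `lam` and `gamma`, the group labels, and the toy losses are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def spld_select(losses, groups, lam, gamma):
    """Return 0/1 selection flags, one per sample (assumed SPLD-style rule).

    Within each group, samples are ranked by ascending loss; the sample at
    rank i (1-based) is kept if loss < lam + gamma / (sqrt(i) + sqrt(i - 1)).
    The threshold shrinks with rank, so taking many samples from one group
    is discouraged and easy samples from other groups are favoured.
    """
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)

    selected = [0] * len(losses)
    for members in by_group.values():
        members.sort(key=lambda idx: losses[idx])
        for rank, idx in enumerate(members, start=1):
            threshold = lam + gamma / (math.sqrt(rank) + math.sqrt(rank - 1))
            if losses[idx] < threshold:
                selected[idx] = 1
    return selected


# Toy per-sample losses and group labels (hypothetical values).
losses = [0.1, 0.2, 0.9, 0.6, 1.5]
groups = ["a", "a", "a", "b", "b"]

early = spld_select(losses, groups, lam=0.3, gamma=0.4)  # strict pace
late = spld_select(losses, groups, lam=1.0, gamma=0.4)   # relaxed pace
print(early)  # [1, 1, 0, 1, 0]
print(late)   # [1, 1, 1, 1, 0]
```

In a full training loop, a call like this would alternate with refitting the regressor (for example an extremely randomized trees model) on the selected samples and recomputing the losses, while `lam` and `gamma` are gradually increased so that harder samples enter later. How SPLDExtraTrees actually schedules these values is not stated in the abstract.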

