There are numerous peptides discovered through past decades, which exhibit antimicrobial and anti-cancerous tendencies. Due to these reasons, peptides are supposed to be sound therapeutic candidates. Some peptides can pose low metabolic stability, high toxicity and high hemolity of peptides. This highlights the importance for evaluating hemolytic tendencies and toxicity of peptides, before using them for therapeutics. Traditional methods for evaluation of toxicity of peptides can be time-consuming and costly. In this study, we have extracted peptides data (Hemo-DB) from Database of Antimicrobial Activity and Structure of Peptides (DBAASP) based on certain hemolity criteria and we present a machine learning based method for prediction of hemolytic tendencies of peptides (i.e. Hemolytic or Non-Hemolytic). Our model offers significant improvement on hemolity prediction benchmarks. we also propose a reliable clustering-based train-tests splitting method which ensures that no peptide in train set is more than 40% similar to any peptide in test set. Using this train-test split, we can get reliable estimated of expected model performance on unseen data distribution or newly discovered peptides. Our model tests 0.9986 AUC-ROC (Area Under Receiver Operating Curve) and 97.79% Accuracy on test set of Hemo-DB using traditional random train-test splitting method. Moreover, our model tests AUC-ROC of 0.997 and Accuracy of 97.58% while using clustering-based train-test data split. Furthermore, we check our model on an unseen data distribution (at Hemo-PI 3) and we recorded 0.8726 AUC-ROC and 79.5% accuracy. Using the proposed method, potential therapeutic peptides can be screened, which may further in therapeutics and get reliable predictions for unseen amino acids distribution of peptides and newly discovered peptides.


翻译:过去几十年中发现了许多止痛药,这些病原体显示的是抗微生物和抗慢性病趋势。由于这些原因,止痛药应该是健康的治疗对象。一些止痛药可能具有低代谢稳定性、高毒性和高异性激素。这凸显了评价止痛药血液倾向和毒性的重要性,然后才用于治疗。对止痛药毒性的传统评估方法可能耗费时间和成本。在这项研究中,我们从抗微生物性C(Hemo-DB)数据库中提取了止痛剂数据。由于这些原因,止痛药应该是健康的治疗对象。一些止痛药可能具有低代谢性稳定性、高毒性和高血激素高异性。我们的模式可以大大改进止痛剂的预测基准。我们还提议一种可靠的基于组织-止痛药的测试方法,确保火车机原的止痛剂(Hemo-DB)活动和止痛剂结构(DBA-C)基于某种血原体测试方法的预测性能(A.

0
下载
关闭预览

相关内容

机器学习系统设计系统评估标准
Linux导论,Introduction to Linux,96页ppt
专知会员服务
79+阅读 · 2020年7月26日
强化学习最新教程,17页pdf
专知会员服务
177+阅读 · 2019年10月11日
【新书】Python编程基础,669页pdf
专知会员服务
195+阅读 · 2019年10月10日
最新BERT相关论文清单,BERT-related Papers
专知会员服务
53+阅读 · 2019年9月29日
Transferring Knowledge across Learning Processes
CreateAMind
28+阅读 · 2019年5月18日
Hierarchical Disentangled Representations
CreateAMind
4+阅读 · 2018年4月15日
已删除
将门创投
5+阅读 · 2018年1月24日
Arxiv
3+阅读 · 2018年2月20日
VIP会员
相关资讯
Transferring Knowledge across Learning Processes
CreateAMind
28+阅读 · 2019年5月18日
Hierarchical Disentangled Representations
CreateAMind
4+阅读 · 2018年4月15日
已删除
将门创投
5+阅读 · 2018年1月24日
Top
微信扫码咨询专知VIP会员