In drug discovery, identifying drug-target interactions (DTIs) experimentally is tedious and expensive. Computational methods predict DTIs efficiently and recommend a small subset of potentially interacting pairs for further experimental confirmation, accelerating the drug discovery process. Two evaluation metrics are widely used in the DTI prediction task: the area under the precision-recall curve (AUPR), which emphasizes the accuracy of top-ranked pairs, and the area under the receiver operating characteristic curve (AUC), which heavily penalizes low-ranked interacting pairs. However, existing DTI prediction methods seldom use these two metrics as loss functions. This paper proposes two matrix factorization methods that optimize AUPR and AUC, respectively. Both methods use graph regularization to ensure the local invariance of training drugs and targets in the latent feature space, and leverage the optimal decay coefficient to infer more reliable latent features for new drugs and targets. Experimental results on four updated benchmark datasets containing more recently verified interactions show that each proposed method is superior in terms of the evaluation metric it optimizes.
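The difference between the two metrics can be illustrated with a small sketch. It is not the paper's method: it only evaluates a fixed ranking, the function names `auc_score` and `aupr_score` and the toy data are hypothetical, and AUPR is approximated here by average precision under the assumption that no two scores are tied.

```python
import numpy as np


def auc_score(labels, scores):
    """AUC as the fraction of (interacting, non-interacting) pairs
    that are ranked in the correct order; ties count as 0.5."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / (pos.size * neg.size))


def aupr_score(labels, scores):
    """AUPR approximated by average precision: the mean of the precision
    values measured at the rank of each interacting pair.
    Assumption: scores contain no ties."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # highest score first
    ranked = labels[order]
    hits = np.cumsum(ranked)                    # interacting pairs found so far
    precision = hits / np.arange(1, ranked.size + 1)
    return float(precision[ranked].mean())


# Toy data (hypothetical): 8 candidate drug-target pairs, 3 of them interacting.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels_a = [1, 1, 0, 0, 0, 0, 0, 1]  # two hits at the top, one ranked last
labels_b = [0, 0, 1, 1, 1, 0, 0, 0]  # all hits in the middle of the ranking

for name, labels in [("A", labels_a), ("B", labels_b)]:
    print(name, round(auc_score(labels, scores), 3), round(aupr_score(labels, scores), 3))
```

In this toy case the two rankings have similar AUC (about 0.67 for A and 0.60 for B) but very different AUPR (about 0.79 versus 0.48): AUPR rewards ranking A for its two top-ranked interacting pairs, while AUC keeps ranking A's score low because one interacting pair sits at the bottom of the list.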