The approval rate of drug candidates is very low, with the majority of failures attributable to safety and efficacy issues. The growing availability of high-dimensional information on targets, drug molecules and indications provides an opportunity for machine learning (ML) methods to integrate multiple data modalities and better predict clinically promising drug targets. Notably, drug targets supported by human genetics evidence have been shown to have better odds of success. However, a recent tensor factorization-based approach found that additional information on targets and indications does not necessarily improve predictive accuracy. Here we revisit this approach by integrating different types of human genetics evidence, collated from publicly available sources, to support each target-indication pair. Using Bayesian tensor factorization, we show that models incorporating all available human genetics evidence (rare disease, gene burden, common disease) modestly improve clinical outcome prediction over models using a single line of genetics evidence. We also provide additional insight into the relative power of different types of human genetics evidence for predicting clinical success.