Personalizing cancer treatment with machine learning holds great promise for improving cancer patients' chances of survival. Despite recent advances in machine learning and precision oncology, this approach remains challenging because collecting data in preclinical and clinical studies to model the efficacy of multiple treatments is often expensive and time-consuming. Moreover, randomized treatment allocation is suboptimal, since some participants or samples do not receive the most appropriate treatment during the trial. To address this challenge, we formulate the drug screening study as a "contextual bandit" problem, in which an algorithm selects anticancer therapeutics based on contextual information about cancer cell lines while adapting its treatment strategy to maximize treatment response in an "online" fashion. We propose a novel deep Bayesian bandits framework that uses a functional prior to approximate the posterior for drug response prediction based on multi-modal information consisting of genomic features and drug structure. We empirically evaluate our method on three large-scale in vitro pharmacogenomic datasets and show that it outperforms several benchmarks in identifying the optimal treatment for a given cell line.
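To make the contextual bandit formulation concrete, the following is a minimal sketch of the online loop: observe a context (a cell line), pick an arm (a drug), observe the response, and update the model. It is not the deep Bayesian framework with a functional prior described in the abstract. As a stand-in, it uses Thompson sampling with a Bayesian linear regression posterior per arm, and it runs on synthetic data. All names (`LinearThompsonArm`, `run_bandit`), the feature dimension, the number of drugs, and the noise level are assumptions made purely for illustration.

```python
import numpy as np


class LinearThompsonArm:
    """Bayesian linear regression posterior for one hypothetical drug (arm)."""

    def __init__(self, dim, prior_precision=1.0, noise_var=0.25):
        self.precision = prior_precision * np.eye(dim)  # posterior precision matrix
        self.b = np.zeros(dim)                          # precision-weighted mean
        self.noise_var = noise_var                      # assumed known noise variance

    def sample_weights(self, rng):
        """Draw one weight vector from the current Gaussian posterior."""
        cov = np.linalg.inv(self.precision)
        cov = (cov + cov.T) / 2.0  # symmetrize against round-off
        mean = cov @ self.b
        return rng.multivariate_normal(mean, cov)

    def update(self, x, reward):
        """Conjugate update after observing `reward` for context `x`."""
        self.precision += np.outer(x, x) / self.noise_var
        self.b += x * reward / self.noise_var


def run_bandit(n_rounds=2000, n_drugs=4, dim=5, seed=0):
    """Run Thompson sampling on synthetic linear responses.

    Returns the fraction of rounds in which the best drug was chosen
    and the cumulative regret.
    """
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=(n_drugs, dim))  # synthetic ground truth, not real data
    arms = [LinearThompsonArm(dim) for _ in range(n_drugs)]
    n_optimal, regret = 0, 0.0

    for _ in range(n_rounds):
        x = rng.normal(size=dim)  # context: stand-in for cell-line features
        # Thompson sampling: score each drug with one posterior draw.
        scores = [arm.sample_weights(rng) @ x for arm in arms]
        choice = int(np.argmax(scores))

        expected = true_w @ x  # true mean response of every drug
        reward = expected[choice] + rng.normal(scale=0.5)  # noisy observation
        arms[choice].update(x, reward)  # only the chosen arm learns

        n_optimal += int(choice == int(np.argmax(expected)))
        regret += expected.max() - expected[choice]

    return n_optimal / n_rounds, regret


if __name__ == "__main__":
    frac, regret = run_bandit()
    print(f"optimal-drug rate: {frac:.2f}, cumulative regret: {regret:.1f}")
```

Unlike randomized allocation, the policy in this sketch shifts toward drugs whose sampled response is high for the given context, so fewer rounds are spent on treatments that are clearly inferior for that context. Replacing the linear posterior with a neural model over genomic and drug-structure features would move the sketch closer to the setting the abstract describes, but that step is not shown here.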