Thanks to the increasing availability of genomic and other biomedical data, many machine learning approaches have been proposed for a wide range of therapeutic discovery and development tasks. In this survey, we review the literature on machine learning applications for genomics through the lens of therapeutic development. We investigate the interplay among genomics, compounds, proteins, electronic health records (EHR), cellular images, and clinical texts. We identify twenty-two applications of machine learning in genomics across the entire therapeutics pipeline, from discovering novel targets, personalizing medicine, and developing gene-editing tools all the way to clinical trials and post-market studies. We also pinpoint seven important challenges in this field that offer opportunities for expansion and impact. This survey provides an overview of recent research at the intersection of machine learning, genomics, and therapeutic development.