CODE-AE: A Coherent De-confounding Autoencoder for Predicting Patient-Specific Drug Response From Cell Line Transcriptomics

Accurate and robust prediction of patients' responses to drug treatments is critical for developing precision medicine. However, it is often difficult to obtain enough coherent drug response data directly from patients to train a generalized machine learning model. Although rich cell line data offer an alternative, transferring the knowledge obtained from cell lines to patients is challenging because of various confounding factors. Few existing transfer learning methods can reliably disentangle common intrinsic biological signals from confounding factors in cell line and patient data. In this paper, we develop a Coherent De-confounding Autoencoder (CODE-AE) that can extract both common biological signals shared by incoherent samples and private representations unique to each data set, transfer knowledge learned from cell line data to tissue data, and separate confounding factors from them. Extensive studies on multiple data sets demonstrate that CODE-AE significantly improves accuracy and robustness over state-of-the-art methods in both predicting patient drug response and de-confounding biological signals. Thus, CODE-AE provides a useful framework for taking advantage of in vitro omics data to develop generalized patient predictive models. The source code is available at https://github.com/XieResearchGroup/CODE-AE.
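The split into shared signals and data-set-specific private representations can be illustrated with a small forward-pass sketch. This is not the CODE-AE implementation (see the repository linked above for that); it is a hypothetical, linear, pure-Python toy. The class name `SharedPrivateAE`, the linear layers, the concatenation of the two codes before decoding, and the squared dot-product penalty that discourages overlap between shared and private codes are all assumptions made for illustration.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; an illustrative initialization only."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class SharedPrivateAE:
    """Hypothetical linear shared/private autoencoder (forward pass only)."""

    def __init__(self, n_genes, n_latent, domains, seed=0):
        rng = random.Random(seed)
        # One encoder shared by all domains (e.g. cell lines and patients).
        self.shared_enc = rand_matrix(n_latent, n_genes, rng)
        # One private encoder per domain, meant to absorb domain-specific factors.
        self.private_enc = {d: rand_matrix(n_latent, n_genes, rng) for d in domains}
        # The decoder reconstructs the input from both codes concatenated.
        self.decoder = rand_matrix(n_genes, 2 * n_latent, rng)

    def encode(self, x, domain):
        shared = matvec(self.shared_enc, x)
        private = matvec(self.private_enc[domain], x)
        return shared, private

    def loss(self, x, domain, ortho_weight=1.0):
        shared, private = self.encode(x, domain)
        recon = matvec(self.decoder, shared + private)  # list concatenation
        recon_err = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
        # Assumed penalty: squared dot product between the two codes.
        ortho = sum(s * p for s, p in zip(shared, private)) ** 2
        return recon_err + ortho_weight * ortho


model = SharedPrivateAE(n_genes=5, n_latent=2, domains=["cell_line", "patient"])
expression = [1.0, 0.5, -0.5, 2.0, 0.0]  # made-up expression values
shared_code, private_code = model.encode(expression, "patient")
print(shared_code, private_code, model.loss(expression, "patient"))
```

In a setup like this, only the shared code would be passed to a downstream drug response predictor, so that a model trained on cell line samples can be applied to patient samples.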

Autoencoder

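An autoencoder is an artificial neural network used to learn efficient data encodings in an unsupervised manner. Its aim is to learn a representation (encoding) of a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Alongside the reduction side, a reconstruction side is learned, in which the autoencoder tries to generate from the reduced encoding a representation as close as possible to its original input, hence the name. Several variants of the basic model exist, each aiming to force the learned representation of the input to take on useful properties. Autoencoders are applied effectively to many problems, from facial recognition to capturing the semantic meaning of words.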
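The reduce-then-reconstruct idea can be shown with a deliberately tiny example: a linear autoencoder that compresses 2-D points to a 1-D code and decodes them back, trained by per-sample gradient descent on squared reconstruction error. Everything here is an illustrative assumption (the function names, the toy data lying on the line y = 2x, the learning rate, and the fixed initial weights); practical autoencoders use nonlinear layers and a deep learning framework.

```python
def mean_loss(data, enc, dec):
    """Average squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        z = enc[0] * x[0] + enc[1] * x[1]            # encode: 2-D -> 1-D
        total += sum((dec[i] * z - x[i]) ** 2 for i in range(2))
    return total / len(data)


def train_linear_autoencoder(data, lr=0.02, epochs=200):
    """Fit a 2 -> 1 -> 2 linear autoencoder with per-sample gradient descent."""
    enc = [0.5, 0.5]  # encoder weights
    dec = [0.5, 0.5]  # decoder weights
    for _ in range(epochs):
        for x in data:
            z = enc[0] * x[0] + enc[1] * x[1]
            resid = [dec[i] * z - x[i] for i in range(2)]  # reconstruction - input
            # Gradient of the squared error with respect to the code z,
            # computed before the decoder weights are updated.
            grad_z = 2 * sum(resid[i] * dec[i] for i in range(2))
            for i in range(2):
                dec[i] -= lr * 2 * resid[i] * z
                enc[i] -= lr * grad_z * x[i]
    return enc, dec


# Toy data: points on the line y = 2x, so one latent dimension suffices.
data = [(t, 2 * t) for t in (-1.0, -0.5, 0.5, 1.0)]
initial = mean_loss(data, [0.5, 0.5], [0.5, 0.5])
enc, dec = train_linear_autoencoder(data)
print("loss before:", initial, "loss after:", mean_loss(data, enc, dec))
```

Because the toy data are exactly one-dimensional, the single code value can carry all the information, and the reconstruction error should shrink toward zero as training proceeds.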
