XOmiVAE: an interpretable deep learning model for cancer classification using high-dimensional omics data

Deep learning-based approaches have proven promising for modelling omics data. However, one of their current limitations compared with statistical and traditional machine learning approaches is the lack of explainability, which not only reduces reliability but also limits the potential for acquiring novel knowledge from unpicking the "black-box" models. Here we present XOmiVAE, a novel interpretable deep learning model for cancer classification using high-dimensional omics data. XOmiVAE can obtain the contribution value of each gene and latent dimension for a specific prediction, as well as the correlation between genes and latent dimensions. We also show that XOmiVAE can explain both the supervised classification and the unsupervised clustering results of the deep learning network. To the best of our knowledge, XOmiVAE is one of the first activation-based deep learning interpretation methods to explain novel clusters generated by variational autoencoders. The results generated by XOmiVAE were validated by both biomedical knowledge and the performance of downstream tasks. XOmiVAE explanations of deep learning-based cancer classification and clustering aligned with current domain knowledge, including biological annotation and literature, which shows great potential for novel biomedical knowledge discovery from deep learning models. The top genes and dimensions selected by XOmiVAE showed a significant influence on the performance of cancer classification. Additionally, we offer important steps to consider when interpreting deep learning models for tumour classification. For instance, we demonstrate the importance of choosing background samples that make biological sense and the limitations of connection-weight-based methods for explaining latent dimensions.
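
To make the idea of a per-gene contribution value relative to background samples concrete, here is a minimal sketch. It assumes a plain linear scoring model, not the XOmiVAE network or its interpretation method: for a linear model, the contribution of feature i relative to a reference is exactly w_i * (x_i - reference_i). The function name, weights, and expression values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def linear_contributions(weights, sample, background):
    """Contribution of each feature to a linear score, relative to the
    mean of the background samples (illustrative toy, not XOmiVAE)."""
    reference = background.mean(axis=0)
    return weights * (sample - reference)

# Hypothetical toy data: 4 "genes" with made-up weights and expression values.
weights = np.array([2.0, -1.0, 0.5, 0.0])
tumour_sample = np.array([3.0, 1.0, 4.0, 2.0])

# Two candidate backgrounds (both assumptions for the example).
normal_background = np.array([[1.0, 1.0, 2.0, 2.0],
                              [1.0, 3.0, 2.0, 0.0]])
other_tumour_background = np.array([[3.0, 0.0, 4.0, 1.0],
                                    [3.0, 2.0, 0.0, 3.0]])

vs_normal = linear_contributions(weights, tumour_sample, normal_background)
vs_other = linear_contributions(weights, tumour_sample, other_tumour_background)

# Index of the gene with the largest absolute contribution under each background.
top_vs_normal = int(np.argmax(np.abs(vs_normal)))
top_vs_other = int(np.argmax(np.abs(vs_other)))

print("contributions vs normal background:", vs_normal)
print("contributions vs other-tumour background:", vs_other)
print("top gene index:", top_vs_normal, "vs", top_vs_other)
```

In this toy case the contributions sum to the difference between the sample's score and the score of the background mean, and the top-ranked gene changes with the background. This is one way to see why the choice of background samples matters when reading such explanations.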
