A systematic review of biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data

There is increasing interest in the use of Deep Learning (DL) based methods as a supporting analytical framework in oncology. However, most direct applications of DL deliver models with limited transparency and explainability, which constrains their deployment in biomedical settings. This systematic review discusses DL models used to support inference in cancer biology, with a particular emphasis on multi-omics analysis. It focuses on how existing models address the need for better dialogue with prior knowledge, biological plausibility and interpretability, which are fundamental properties in the biomedical domain. To this end, we retrieved and analysed 42 studies, focusing on emerging architectural and methodological advances, the encoding of biological domain knowledge and the integration of explainability methods. We discuss the recent evolutionary arc of DL models towards integrating prior biological relational and network knowledge (e.g. pathways or protein-protein interaction networks) to support better generalisation and interpretability. This represents a fundamental functional shift towards models that can integrate mechanistic and statistical inference aspects. We introduce the concept of bio-centric interpretability and, following its taxonomy, discuss representational methodologies for integrating domain prior knowledge into such models. The paper provides a critical outlook on contemporary methods for explainability and interpretability used in DL for cancer. The analysis points towards a convergence between encoding prior knowledge and improved interpretability. Bio-centric interpretability is an important step towards formalising the biological interpretability of DL models and developing methods that are less problem- or application-specific.
