In cancer epidemiology, the \emph{relative survival framework} is used to quantify the hazard associated with cancer by comparing the all-cause mortality hazard in cancer patients to that of the general population. This framework assumes that an individual's hazard function is the sum of a known population hazard and an excess hazard associated with the cancer. Several estimands, including the \emph{net survival}, are derived from the excess hazard and are used to inform decisions and to assess the effectiveness of cancer management interventions. In this paper, we introduce a Bayesian machine learning approach, based on Bayesian additive regression trees (BART), for estimating the excess hazard and identifying vulnerable subgroups with a higher excess risk. We first develop a proportional hazards extension of the BART model to the relative survival setting, and then extend this model to non-proportional hazards. We develop tools for model interpretation and posterior summarization, and then present an application to colon cancer data from England, highlighting the insights our proposed methodology offers when paired with state-of-the-art data linkage methods. This application demonstrates how these methods can be used to identify drivers of inequalities in cancer survival through variable importance quantification.
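
As a brief illustration of the additive decomposition described above, the relative survival framework can be sketched as follows. The notation here is an assumption introduced for illustration only: $h$ denotes the all-cause hazard of a patient with covariates $\mathbf{x}$, $h_P$ the known population hazard (typically read from life tables at attained age $a + t$ for population characteristics $\mathbf{z}$), and $h_E$ the excess hazard attributed to the cancer. Under this notation, the net survival $S_N$ is the survival function implied by the excess hazard alone:
\[
h(t \mid \mathbf{x}) = h_P(a + t \mid \mathbf{z}) + h_E(t \mid \mathbf{x}),
\qquad
S_N(t \mid \mathbf{x}) = \exp\left\{ -\int_0^t h_E(s \mid \mathbf{x}) \, ds \right\}.
\]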