Survival data with time-varying covariates are common in practice. When such covariates are relevant, they can improve the estimation of the survival function. However, the traditional survival forests (conditional inference forest, relative risk forest, and random survival forest) have accommodated only time-invariant covariates. We generalize the conditional inference and relative risk forests to allow time-varying covariates. We also propose a general framework for estimating a survival function in the presence of time-varying covariates. We compare the performance of the two proposed forests with that of the Cox model and the transformation forest, both adapted here to accommodate time-varying covariates, through a comprehensive simulation study in which the Kaplan-Meier estimate serves as a benchmark and performance is measured by the integrated L2 difference between the true and estimated survival functions. In general, the two proposed forests improve substantially over the Kaplan-Meier estimate. Taking all other factors into account, under the proportional hazards (PH) setting the best method is always one of the two proposed forests, while under the non-PH setting it is the adapted transformation forest. K-fold cross-validation serves as an effective tool for choosing between the methods in practice.
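To illustrate the evaluation criterion, the following minimal sketch computes an integrated L2 difference between a Kaplan-Meier estimate and a known true survival function. It is an illustration under stated assumptions, not the procedure used in the study: the function names (`kaplan_meier`, `step_eval`, `integrated_l2`), the trapezoidal approximation, and the normalization by the length of the time interval are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: returns distinct event times and S(t) at each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(times >= t)              # subjects still under observation
        deaths = np.sum((times == t) & events)    # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv[i] = s
    return event_times, surv


def step_eval(grid, event_times, surv):
    """Evaluate the right-continuous step function on a grid (1 before the first event)."""
    idx = np.searchsorted(event_times, grid, side="right")
    return np.concatenate(([1.0], surv))[idx]


def integrated_l2(s_hat, s_true, grid):
    """Trapezoidal approximation of (1 / length) * integral of (s_hat - s_true)^2.

    Dividing by the interval length is an assumption made for this sketch.
    """
    diff2 = (np.asarray(s_hat) - np.asarray(s_true)) ** 2
    area = np.sum(0.5 * (diff2[1:] + diff2[:-1]) * np.diff(grid))
    return area / (grid[-1] - grid[0])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    event_time = rng.exponential(scale=1.0, size=n)   # true S(t) = exp(-t)
    censor_time = rng.exponential(scale=2.0, size=n)  # independent censoring
    observed = np.minimum(event_time, censor_time)
    status = event_time <= censor_time

    grid = np.linspace(0.0, 2.0, 401)
    km_times, km_surv = kaplan_meier(observed, status)
    s_hat = step_eval(grid, km_times, km_surv)
    print(integrated_l2(s_hat, np.exp(-grid), grid))
```

In a comparison of the kind described above, `s_hat` would be replaced by each method's estimated survival curve on the same grid, so that smaller values indicate estimates closer to the truth.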