Modern applications of survival analysis increasingly involve time-dependent covariates. In healthcare settings, such covariates provide dynamic patient histories that can be used to assess health risks in real time by tracking the hazard function. Hazard learning is thus particularly useful in healthcare analytics, and the open-source package BoXHED 1.0 provides the first implementation of a gradient boosted hazard estimator that is fully nonparametric. This paper introduces BoXHED 2.0, a quantum leap over BoXHED 1.0 in several ways. Crucially, BoXHED 2.0 can deal with survival data that goes far beyond right-censoring, and it also supports recurring events. To our knowledge, this is the only nonparametric machine learning implementation that is able to do so. Another major improvement is that BoXHED 2.0 is orders of magnitude more scalable, due in part to a novel data preprocessing step that sidesteps the need for explicit quadrature when dealing with time-dependent covariates. BoXHED 2.0 supports the use of GPUs and multicore CPUs, and is available from GitHub: www.github.com/BoXHED.