The modeling of time-to-event data, also known as survival analysis, requires specialized methods that can handle censoring and truncation as well as time-varying features and effects, and that extend to settings with multiple competing events. However, many machine learning methods for survival analysis only consider the standard setting with right-censored data and a proportional hazards assumption. The methods that do provide extensions usually address at most a subset of these challenges and often require specialized software that cannot be integrated directly into standard machine learning workflows. In this work, we present a very general machine learning framework for time-to-event analysis that uses a data augmentation strategy to reduce complex survival tasks to standard Poisson regression tasks. This reformulation is based on well-developed statistical theory. With the proposed approach, any algorithm that can optimize a Poisson (log-)likelihood, such as gradient boosted trees, deep neural networks, model-based boosting, and many more, can be used in the context of time-to-event analysis. The proposed technique does not require any assumptions with respect to the distribution of event times or the functional shapes of feature and interaction effects. Based on the proposed framework, we develop new methods that are competitive with specialized state-of-the-art approaches in terms of accuracy and versatility, while requiring comparatively little programming effort or specialized methodological know-how.
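The abstract does not spell out the data augmentation step, so the following is only a minimal sketch of the standard piecewise exponential transformation that reductions to Poisson regression are commonly based on; it is an assumption that the framework works along these lines. The function names `to_piecewise_exponential` and `interval_hazards`, the cut points, and the toy data are all hypothetical. Each subject's follow-up is split into intervals, and every interval at risk becomes one row with an exposure time and a 0/1 pseudo-response. Without features, the Poisson maximum likelihood estimate of the hazard in each interval is simply events divided by exposure.

```python
from collections import defaultdict


def to_piecewise_exponential(times, events, cuts):
    """Split each subject's follow-up into intervals (cuts[j-1], cuts[j]].

    Returns one row per subject and interval at risk:
    (subject_id, interval_index, exposure_time, event_indicator).
    """
    rows = []
    for subject_id, (t, d) in enumerate(zip(times, events)):
        start = 0.0
        for j, end in enumerate(cuts):
            if t <= start:
                break  # subject left the risk set before this interval
            exposure = min(t, end) - start
            # The pseudo-response is 1 only in the interval where the event occurs.
            died_here = int(d == 1 and t <= end)
            rows.append((subject_id, j, exposure, died_here))
            start = end
    return rows


def interval_hazards(rows):
    """Poisson MLE for a feature-free model with a log(exposure) offset:
    hazard_j = (events in interval j) / (total exposure in interval j)."""
    n_events = defaultdict(int)
    exposure = defaultdict(float)
    for _, j, e, d in rows:
        n_events[j] += d
        exposure[j] += e
    return {j: n_events[j] / exposure[j] for j in sorted(exposure)}


# Hypothetical toy data: follow-up times and status (1 = event, 0 = right-censored).
times = [1.5, 3.0, 0.5, 2.0]
events = [1, 0, 1, 1]
cuts = [1.0, 2.0, 3.0]  # interval end points; the last one must cover max(times)

rows = to_piecewise_exponential(times, events, cuts)
hazards = interval_hazards(rows)
```

With features, the same augmented rows could be passed to any learner that optimizes a Poisson log-likelihood, using the interval index and the features as inputs and the log of the exposure time as an offset.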