In the era of precision medicine, time-to-event outcomes such as time to death or disease progression are routinely collected alongside high-throughput covariates. Such high-dimensional data defy classical survival regression models, which are either infeasible to fit or prone to poor predictive performance due to overfitting. To overcome this, recent work has emphasized novel approaches to feature selection and survival prognostication. We review cutting-edge methods for survival outcome data with high-dimensional predictors, highlighting recent innovations in machine learning approaches for survival prediction. We cover the statistical intuitions and principles behind these methods and conclude with extensions to more complex settings in which competing events are observed. We illustrate these methods with applications to the Boston Lung Cancer Survival Cohort study, one of the largest cancer epidemiology cohorts investigating the complex mechanisms of lung cancer.