Survival modeling in healthcare relies on explainable statistical models, yet their underlying assumptions are often simplistic and therefore unrealistic. Machine learning models can capture more complex relationships and yield more accurate predictions, but they are typically not interpretable. This study shows that hospitalization for congestive heart failure can be predicted from a 30-second single-lead electrocardiogram signal. A machine learning approach not only delivers greater predictive power but also provides clinically meaningful interpretations. We train an eXtreme Gradient Boosting accelerated failure time model and use SHapley Additive exPlanations values to explain the effect of each feature on the predictions. Our model achieved a concordance index of 0.828 and an area under the curve of 0.853 at one year and 0.858 at two years on a held-out test set of 6,573 patients. These results show that a rapid electrocardiogram-based test could be crucial in targeting and treating high-risk individuals.
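For readers unfamiliar with the concordance index reported in the abstract, the following is a minimal sketch of how such a score can be computed for a model that predicts survival times, as an accelerated failure time model does. It is a simplified version of Harrell's C that ignores tied event times, and it is not the authors' evaluation code. The function name `concordance_index` and the toy data are assumptions made for illustration only.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    predicted_times: Sequence[float],
) -> float:
    """Simplified Harrell's C for predicted survival times (hypothetical helper).

    A pair (i, j) is comparable when patient i had an observed event and
    their observed time is strictly shorter than patient j's time.
    The pair is concordant when the model also predicts a shorter
    survival time for i than for j. Tied predictions count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # Censored patients cannot anchor a comparable pair.
            continue
        for j in range(n):
            if i == j or not times[i] < times[j]:
                continue
            comparable += 1
            if predicted_times[i] < predicted_times[j]:
                concordant += 1.0
            elif predicted_times[i] == predicted_times[j]:
                concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; the index is undefined")
    return concordant / comparable


# Toy data (illustrative only): the third patient is censored.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [True, True, False, True]
toy_predictions = [1.0, 3.0, 2.0, 4.0]
toy_score = concordance_index(toy_times, toy_events, toy_predictions)
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 means every pair is ranked in reverse. The quadratic loop is adequate for small examples; a test set of several thousand patients would normally call for a library implementation.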