The COVID-19 pandemic has created an urgent need for robust, scalable monitoring tools that support the stratification of high-risk patients. This research aims to develop and validate prediction models, using the UK Biobank, to estimate mortality risk in confirmed COVID-19 cases. From the 11,245 participants who tested positive for COVID-19, we develop a data-driven random forest classification model with excellent performance (AUC: 0.91). The model uses baseline characteristics, pre-existing conditions, symptoms, and vital signs, so that the score can dynamically reassess mortality risk as the disease deteriorates. We also identify several significant novel predictors of COVID-19 mortality, including detailed anthropometrics and prior acute kidney failure, urinary tract infection, and pneumonia, whose predictive value is equivalent to or greater than that of established high-risk comorbidities. The model design and feature selection enable use in outpatient settings. Possible applications include supporting individual-level risk profiling and monitoring disease progression across patients with COVID-19 at scale, especially in hospital-at-home settings.
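The abstract summarizes model performance as an AUC of 0.91. As a minimal sketch of what that statistic measures, and not the authors' evaluation code, the function below computes the area under the ROC curve from binary outcomes and predicted risk scores using the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen death receives a higher score than a randomly chosen survivor, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not drawn from the UK Biobank data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: iterable of 0/1 outcomes (1 = event, e.g. death).
    scores: iterable of predicted risks, higher = more likely event.
    Tied scores receive their average rank, so a tied
    positive/negative pair contributes 0.5.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # 1-based ranks i+1 .. j share the average rank.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # U counts the positive/negative pairs that are ordered correctly.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Hypothetical toy data: two survivors (0) and two deaths (1).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

In the toy data, three of the four death/survivor pairs are ordered correctly, so the function returns 0.75. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.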