Diabetes affects over 400 million people and is among the leading causes of morbidity worldwide. Identifying high-risk individuals can support early diagnosis and help prevent disease development through lifestyle changes. However, most existing risk scores require blood-based measurements that cannot be obtained outside the clinic. Here, we aimed to develop an accessible solution that could be deployed digitally and at scale. We developed a predictive 10-year type 2 diabetes risk score using 301 features derived from 472,830 participants in the UK Biobank dataset, excluding any features that cannot easily be collected via a smartphone. Using a data-driven feature selection process, 19 features were included in the final reduced model. A Cox proportional hazards model slightly outperformed a DeepSurv model trained on the same features, achieving a concordance index of 0.818 (95% CI: 0.812-0.823), compared with 0.811 (95% CI: 0.806-0.815). The final model showed good calibration. This tool can be used for clinical screening of individuals at risk of developing type 2 diabetes and to foster patient empowerment by broadening patients' knowledge of the factors affecting their personal risk.
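For readers unfamiliar with the concordance index reported above, the following is a minimal sketch of how Harrell's C can be computed from follow-up times, event indicators, and predicted risk scores. It is not the authors' evaluation code: the function name `concordance_index`, the simplified handling of tied times (such pairs are skipped), and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs whose risk ordering matches outcome order.

    times       -- observed follow-up time per participant
    events      -- 1 if diabetes was diagnosed at that time, 0 if censored
    risk_scores -- model output, where higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the earlier time is an observed event.
        # Simplification: pairs with identical times are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): four participants, the third one is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported 0.818 means that in roughly 82% of comparable pairs the participant diagnosed earlier was assigned the higher risk.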