Diagnostic and prognostic models are increasingly important in medicine and inform many clinical decisions. Recently, machine learning approaches have shown improvement over conventional modeling techniques by better capturing complex interactions between patient covariates in a data-driven manner. However, the use of machine learning introduces a number of technical and practical challenges that have thus far restricted widespread adoption of such techniques in clinical settings. To address these challenges and empower healthcare professionals, we present a machine learning framework, AutoPrognosis 2.0, to develop diagnostic and prognostic models. AutoPrognosis leverages state-of-the-art advances in automated machine learning to develop optimized machine learning pipelines, incorporates model explainability tools, and enables deployment of clinical demonstrators, without requiring significant technical expertise. Our framework eliminates the major technical obstacles to predictive modeling with machine learning that currently impede clinical adoption. To demonstrate AutoPrognosis 2.0, we provide an illustrative application where we construct a prognostic risk score for diabetes using the UK Biobank, a prospective study of 502,467 individuals. The models produced by our automated framework achieve greater discrimination for diabetes than expert clinical risk scores. Our risk score has been implemented as a web-based decision support tool and can be publicly accessed by patients and clinicians worldwide. In addition, AutoPrognosis 2.0 is provided as an open-source Python package. By open-sourcing our framework as a tool for the community, we enable clinicians and other medical practitioners to readily develop new risk scores, personalized diagnostics, and prognostics using modern machine learning techniques.