Stability of clinical prediction models developed using statistical or machine learning methods

Clinical prediction models estimate an individual's risk of a particular health outcome, conditional on their values of multiple predictors. A developed model is a consequence of the development dataset and the chosen model building strategy, including the sample size, number of predictors and analysis method (e.g., regression or machine learning). Here, we raise the concern that many models are developed using small datasets that lead to instability in the model and its predictions (estimated risks). We define four levels of model stability in estimated risks, moving from the overall mean to the individual level. Then, through simulation and case studies of statistical and machine learning approaches, we show that instability in a model's estimated risks is often considerable, and ultimately manifests itself as miscalibration of predictions in new data. Therefore, we recommend that researchers always examine instability at the model development stage, and we propose instability plots and measures to do so. This entails repeating the model building steps (those used in the development of the original prediction model) in each of multiple (e.g., 1000) bootstrap samples, to produce multiple bootstrap models, and then deriving (i) a prediction instability plot of bootstrap model predictions (y-axis) versus original model predictions (x-axis); (ii) a calibration instability plot showing calibration curves for the bootstrap models in the original sample; and (iii) the instability index, which is the mean absolute difference between individuals' original and bootstrap model predictions. A case study is used to illustrate how these instability assessments indicate whether or not model predictions are likely to be reliable, whilst also informing a model's critical appraisal (risk of bias rating), fairness assessment and further validation requirements.
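The bootstrap procedure described in the abstract can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: the simulated dataset, the plain logistic regression used as the "model building strategy", and the function names (`simulate`, `fit_logistic`, `predict_risk`, `instability_assessment`) are all assumptions made for this sketch. It refits the model in each bootstrap sample, applies every bootstrap model to the original individuals, and computes the instability index as the mean absolute difference between original and bootstrap predictions.

```python
import numpy as np


def simulate(n, seed):
    """Simulate a hypothetical development dataset with 3 predictors."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    lp = -1.0 + X @ np.array([0.8, -0.5, 0.3])
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lp)))
    return X, y


def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Logistic regression via Newton-Raphson; returns [intercept, slopes].

    The tiny ridge term is only for numerical stability (an assumption of
    this sketch, not something stated in the source).
    """
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xd @ beta, -30, 30)))
        w = p * (1.0 - p)
        grad = Xd.T @ (y - p) - ridge * beta
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def predict_risk(beta, X):
    """Estimated risks for the rows of X under coefficients beta."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-np.clip(Xd @ beta, -30, 30)))


def instability_assessment(X, y, n_boot=200, seed=0):
    """Repeat the model building steps in each bootstrap sample.

    Returns the original predictions, an (n_boot, n) array of bootstrap
    model predictions for the original individuals, each individual's mean
    absolute difference, and the overall instability index.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    original = predict_risk(fit_logistic(X, y), X)
    boot = np.empty((n_boot, n))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample n rows with replacement
        boot[b] = predict_risk(fit_logistic(X[idx], y[idx]), X)
    per_individual = np.mean(np.abs(boot - original), axis=0)
    return original, boot, per_individual, per_individual.mean()


X, y = simulate(200, seed=1)
original, boot, per_individual, index = instability_assessment(X, y)
print(f"instability index: {index:.3f}")
```

Plotting each row of `boot` against `original` would give a prediction instability plot in the sense of item (i). The abstract suggests around 1000 bootstrap samples; 200 are used here only to keep the example fast. The calibration instability plot of item (ii) is not covered by this sketch.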
