The Expected Value of Perfect Information for Risk Prediction Models

Risk prediction models are often constructed using a finite development sample, so the resulting predicted risks are uncertain. The decision-theoretic implications of this prediction uncertainty have not been sufficiently studied. For risk prediction models, a measure of net benefit can be calculated by interpreting the positivity threshold as an exchange rate between true positive and false positive outcomes. Adopting a Bayesian perspective, we apply Value of Information concepts from decision theory to such net benefit calculations when developing a risk prediction model. We define the Expected Value of Perfect Information (EVPI) as the expected gain in net benefit from using the correct predictions as opposed to the proposed model. We suggest bootstrap methods for sampling from the posterior distribution of predictions, so that EVPI can be calculated using Monte Carlo simulations. In a case study, we used subsets of data of various sizes from a clinical trial to develop risk prediction models for 30-day mortality after acute myocardial infarction. With a sample size of 1,000, EVPI was 0 at threshold values above 0.6, indicating that procuring more development data would bring no expected gain. At thresholds of 0.4-0.6, the proposed model was not net beneficial, but EVPI was positive, indicating that obtaining more development data might be beneficial. Across the entire range of thresholds, the gain in net benefit from using the correct model was 24% higher than the gain from using the proposed model. EVPI declined with larger sample sizes and was generally low with sample sizes of 4,000 and above. We summarize an algorithm for incorporating EVPI calculations into the commonly used bootstrap method for optimism correction. Value of Information methods can be applied to explore the decision-theoretic consequences of uncertainty in risk prediction, and can complement inferential methods when developing or validating risk prediction models.
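
The EVPI calculation described above can be illustrated with a minimal sketch. It assumes the usual decision-curve form of net benefit, in which each false positive is weighted by z/(1-z) at threshold z, and it assumes that draws of the correct risks (for example, from bootstrap refits of the model) are already available as an array. The function names, the array layout, and the simulated inputs are illustrative assumptions and are not taken from the source.

```python
import numpy as np


def net_benefit_terms(true_risk, model_risk, z):
    """Per-draw net benefit of three strategies at threshold z.

    true_risk  : array (n_draws, n_patients), draws of the correct risks
                 (assumed to come from a bootstrap or posterior sampler).
    model_risk : array (n_patients,), predictions of the proposed model.
    """
    true_risk = np.asarray(true_risk, dtype=float)
    model_risk = np.asarray(model_risk, dtype=float)
    w = z / (1.0 - z)  # exchange rate between true and false positives
    # Expected net gain from treating each patient, given the drawn true risk.
    gain = true_risk - (1.0 - true_risk) * w
    nb_all = gain.mean(axis=1)                         # treat everyone
    nb_model = (gain * (model_risk > z)).mean(axis=1)  # treat if model risk > z
    nb_perfect = (gain * (true_risk > z)).mean(axis=1) # treat if true risk > z
    return nb_model, nb_all, nb_perfect


def evpi(true_risk, model_risk, z):
    """Expected gain in net benefit from knowing the correct risks."""
    nb_model, nb_all, nb_perfect = net_benefit_terms(true_risk, model_risk, z)
    # Best strategy available now: treat none (0), use the model, or treat all.
    best_current = max(0.0, nb_model.mean(), nb_all.mean())
    return nb_perfect.mean() - best_current


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated stand-ins: model predictions plus noisy draws around them.
    model_risk = rng.beta(1.0, 8.0, size=500)
    noise = rng.normal(0.0, 0.5, size=(200, 500))
    logit = np.log(model_risk / (1.0 - model_risk)) + noise
    true_risk = 1.0 / (1.0 + np.exp(-logit))
    for z in (0.05, 0.1, 0.2):
        print(f"threshold={z:.2f}  EVPI={evpi(true_risk, model_risk, z):.5f}")
```

Under this form, treating a patient has positive expected gain exactly when the true risk exceeds z, so the perfect-information strategy is never worse than the alternatives and the resulting EVPI is non-negative.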
