Survival analysis models the distribution of time until an event of interest, such as discharge from the hospital or admission to the ICU. When a model's predicted number of events within any time interval is similar to the observed number, it is called well-calibrated. A survival model's calibration can be measured using, for instance, distributional calibration (D-CALIBRATION) [Haider et al., 2020], which computes the squared difference between the observed and predicted number of events within different time intervals. Classically, calibration is addressed in post-training analysis. We develop explicit calibration (X-CAL), which turns D-CALIBRATION into a differentiable objective that can be used in survival modeling alongside maximum likelihood estimation and other objectives. X-CAL allows practitioners to directly optimize calibration and strike a desired balance between predictive power and calibration. In our experiments, we fit a variety of shallow and deep models on simulated data, on a survival dataset based on MNIST, on length-of-stay prediction using MIMIC-III data, and on brain cancer data from The Cancer Genome Atlas. We show that the models we study can be miscalibrated. We give experimental evidence on these datasets that X-CAL improves D-CALIBRATION without a large decrease in concordance or likelihood.
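To make the calibration measure concrete, here is a minimal sketch of the idea, under simplifying assumptions that are not taken from the abstract. It assumes every event time is observed (no censoring), so that a well-calibrated model's predicted CDF values at the observed times are spread uniformly over [0, 1]. The names `d_calibration` and `soft_d_calibration`, the bin count, and the sharpness parameter `gamma` are illustrative choices. The smooth bin membership is one possible relaxation of the hard bin indicator and is not necessarily the exact form used by X-CAL.

```python
import numpy as np


def d_calibration(cdf_values, n_bins=10):
    """Hard-binned calibration score for uncensored data (illustrative).

    cdf_values[i] is the model's predicted CDF evaluated at the observed
    event time of individual i. Lower scores mean better calibration.
    """
    u = np.asarray(cdf_values, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    counts, _ = np.histogram(u, bins=edges)
    observed = counts / u.size   # fraction of events falling in each bin
    expected = np.diff(edges)    # fraction expected under perfect calibration
    return float(np.sum((observed - expected) ** 2))


def soft_d_calibration(cdf_values, n_bins=10, gamma=1e4):
    """Smooth variant: hard bin membership is replaced by a sigmoid.

    Assumption: membership in the bin (lo, hi) is approximated by
    sigmoid(gamma * (u - lo) * (hi - u)), which is close to 1 inside the
    bin and close to 0 outside it, and is differentiable in u.
    """
    u = np.asarray(cdf_values, dtype=float)[:, None]
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    lo, hi = edges[:-1], edges[1:]
    z = gamma * (u - lo) * (hi - u)
    membership = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    observed = membership.mean(axis=0)
    expected = hi - lo
    return float(np.sum((observed - expected) ** 2))
```

Because the smooth score is differentiable with respect to the predicted CDF values, the same computation written in an automatic-differentiation framework could be added to a training loss with a weight that trades calibration against likelihood. Handling censored observations requires additional care that this sketch omits.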