This manuscript addresses two simultaneous problems: predicting all-cause inpatient readmission or death after discharge, and quantifying the impact of discharge placement in preventing these adverse events. To this end, we developed an inherently interpretable multilevel Bayesian modeling framework inspired by the piecewise linearity of ReLU-activated deep neural networks. In a survival model, we explicitly adjust for confounding when quantifying local average treatment effects for discharge placement interventions. We trained the model on a 5% sample of Medicare beneficiaries from 2008 and 2011, and then tested the model on 2012 claims. Evaluated on classification accuracy for 30-day all-cause unplanned readmissions (defined using official CMS methodology) or death, the model performed comparably to XGBoost, logistic regression (after feature engineering), and a Bayesian deep neural network trained on the same data. On the 30-day classification task of predicting readmission or death using held-out future data, the model achieved an AUROC of approximately 0.76 and an AUPRC of approximately 0.50 (relative to an overall positive rate in the testing data of 18%), demonstrating that one need not sacrifice interpretability for accuracy. Additionally, the model had a testing AUROC of 0.78 on the classification of 90-day all-cause unplanned readmission or death. Because the model is inherently interpretable, we can inspect it directly and summarize its main findings. We also demonstrate that the black-box post hoc explainer tool SHAP generates explanations that are not supported by the fitted model and that, if taken at face value, do not offer enough context to make a model actionable.