When machine learning techniques are used in decision-making processes, the interpretability of the models is important. In the present paper, we adopted the Shapley additive explanation (SHAP) framework, which is based on the fair allocation of profit among many stakeholders according to their contributions, to interpret a gradient-boosting decision tree model trained on hospital data. For better interpretability, we propose two novel techniques: (1) a new SHAP-based metric of feature importance and (2) feature packing, a technique that packs multiple similar features into one grouped feature to allow an easier understanding of the model without reconstructing it. We then compared the explanation results of the SHAP framework with those of existing methods. In addition, using our hospital data and the proposed techniques, we showed how the A/G ratio works as an important prognostic factor for cerebral infarction.
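The feature-packing idea rests on the additivity of SHAP values: the contribution of a group of features is simply the sum of its members' per-sample SHAP values, so similar features can be merged in the explanation without retraining the model. The sketch below illustrates this on a precomputed SHAP matrix; the function names, the grouping dictionary, and the use of mean absolute SHAP value as the importance metric are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def pack_features(shap_values, feature_names, groups):
    """Pack SHAP values of similar features into grouped features.

    shap_values   : (n_samples, n_features) array of SHAP values
    feature_names : list of feature names, column-aligned with shap_values
    groups        : dict mapping a group name to a list of member features

    Because SHAP values are additive, a group's contribution is the sum of
    its members' SHAP values; the model itself is left untouched.
    """
    idx = {name: i for i, name in enumerate(feature_names)}
    packed_names, packed_cols, used = [], [], set()
    for group_name, members in groups.items():
        cols = [idx[m] for m in members]
        used.update(cols)
        packed_names.append(group_name)
        packed_cols.append(shap_values[:, cols].sum(axis=1))
    # ungrouped features pass through unchanged, preserving column order
    for name, i in idx.items():
        if i not in used:
            packed_names.append(name)
            packed_cols.append(shap_values[:, i])
    return np.column_stack(packed_cols), packed_names

def shap_importance(shap_values):
    # mean absolute SHAP value per feature -- a common (hypothetical here)
    # SHAP-based global importance metric
    return np.abs(shap_values).mean(axis=0)
```

In practice `shap_values` would come from, e.g., `shap.TreeExplainer` applied to the gradient-boosting model; packing then operates purely on the explanation matrix, which is what lets the grouped view be produced without model reconstruction.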