Current practice in interpretable machine learning often focuses on explaining the final model trained from data, e.g., by using the Shapley additive explanations (SHAP) method. The recently developed Shapley variable importance cloud (ShapleyVIC) extends this practice to a group of "nearly optimal models" to provide comprehensive and robust variable importance assessments, with estimated uncertainty intervals for a more complete understanding of variable contributions to predictions. ShapleyVIC was initially developed for applications with traditional regression models, and the benefits of ShapleyVIC inference have been demonstrated in real-life prediction tasks using logistic regression. However, as a model-agnostic approach, its application is not limited to such scenarios. In this work, we extend the ShapleyVIC implementation to machine learning models to enable wider applications, and propose it as a useful complement to current SHAP analysis for more trustworthy applications of these black-box models.
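To make the "current practice" the abstract refers to concrete, below is a minimal sketch of a standard SHAP analysis of a single trained black-box model, using the Python `shap` library. The dataset and model choice (scikit-learn's breast cancer data with an XGBoost classifier) are illustrative assumptions, not taken from the paper; ShapleyVIC would extend such an analysis by aggregating importance assessments across a set of nearly optimal models rather than explaining one final model.

```python
# A minimal sketch of current SHAP practice: explain one final model.
# Dataset and model here are illustrative assumptions, not from the paper.
import shap
import xgboost
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = xgboost.XGBClassifier(n_estimators=100).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global variable importance for the single final model; ShapleyVIC
# would instead assess importance over a group of nearly optimal models,
# with uncertainty intervals around each variable's contribution.
shap.summary_plot(shap_values, X, plot_type="bar")
```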