An increasing number of model-agnostic interpretation techniques for machine learning (ML) models, such as partial dependence plots (PDP), permutation feature importance (PFI), and Shapley values, provide insightful model interpretations, but can lead to wrong conclusions if applied incorrectly. We highlight many general pitfalls of ML model interpretation, such as using interpretation techniques in the wrong context, interpreting models that do not generalize well, ignoring feature dependencies, interactions, uncertainty estimates, and issues in high-dimensional settings, or making unjustified causal interpretations, and we illustrate them with examples. We focus on pitfalls for global methods that describe the average model behavior, but many pitfalls also apply to local methods that explain individual predictions. Our paper addresses ML practitioners by raising awareness of pitfalls and identifying solutions for correct model interpretation, but it also addresses ML researchers by discussing open issues for further research.
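As a minimal, hedged sketch of two of the global methods named above, the following Python snippet computes PFI and PDP values with scikit-learn on synthetic data; the dataset, model, and all parameter choices are illustrative assumptions, not part of the paper's experiments.

```python
# Illustrative sketch (assumes scikit-learn is installed): PFI and PDP on a held-out set,
# checking generalization before interpreting the model.
import numpy as np
from sklearn.datasets import make_regression  # synthetic data, for illustration only
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence, permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=5, noise=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
print("held-out R^2:", model.score(X_test, y_test))  # interpret only if this is acceptable

# Permutation feature importance (PFI): drop in score when a feature is shuffled,
# repeated several times to obtain a simple uncertainty estimate.
pfi = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for j in np.argsort(pfi.importances_mean)[::-1]:
    print(f"feature {j}: {pfi.importances_mean[j]:.3f} +/- {pfi.importances_std[j]:.3f}")

# Partial dependence (PDP) values for feature 0: average prediction over the data
# while feature 0 is set to each grid value (implicitly assumes feature independence).
pd_result = partial_dependence(model, X_test, features=[0], grid_resolution=20)
print("PDP grid averages for feature 0:", np.round(pd_result["average"][0], 2))
```

Note that both methods rely on assumptions flagged in the paper: PFI and PDP extrapolate to unrealistic data points when features are dependent, and the reported standard deviations capture only part of the estimation uncertainty.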