Unrestricted Permutation Forces Extrapolation: Variable Importance Requires at Least One More Model, or There Is No Free Variable Importance

This paper reviews and advocates against the use of permute-and-predict (PaP) methods for interpreting black box functions. Methods such as the variable importance measures proposed for random forests, partial dependence plots, and individual conditional expectation plots remain popular because they are both model-agnostic and depend only on the pre-trained model output, making them computationally efficient and widely available in software. However, numerous studies have found that these tools can produce diagnostics that are highly misleading, particularly when there is strong dependence among features. The purpose of our work here is to (i) review this growing body of literature, (ii) provide further demonstrations of these drawbacks along with a detailed explanation as to why they occur, and (iii) advocate for alternative measures that involve additional modeling. In particular, we describe how breaking dependencies between features in hold-out data places undue emphasis on sparse regions of the feature space by forcing the original model to extrapolate to regions where there is little to no data. We explore these effects across various model setups and find support for previous claims in the literature that PaP metrics can vastly over-emphasize correlated features in both variable importance measures and partial dependence plots. As an alternative, we discuss and recommend more direct approaches that involve measuring the change in model performance after muting the effects of the features under investigation.
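
The mechanism described in the abstract can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' experimental setup: the data-generating process, the correlation of 0.9, the noise level, the 99% Mahalanobis threshold used as a stand-in for "regions with little to no data", and the helper names (`fit_ols`, `predict`, `mse`, `mahalanobis_sq`) are all hypothetical choices made here. A linear least-squares fit is used only to keep the sketch dependency-free, and "muting" a feature is interpreted as refitting one more model without it.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 4000, 0.9  # illustrative sample size and feature correlation

# Two strongly correlated features; y depends on both (assumed toy setup).
cov = np.array([[1.0, rho], [rho, 1.0]])
X = rng.multivariate_normal(np.zeros(2), cov, size=n)
y = X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n)
X_tr, X_te, y_tr, y_te = X[: n // 2], X[n // 2 :], y[: n // 2], y[n // 2 :]


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def mse(coef, X, y):
    return float(np.mean((y - predict(coef, X)) ** 2))


def mahalanobis_sq(Z, mean, cov_inv):
    """Squared Mahalanobis distance of each row of Z from `mean`."""
    d = Z - mean
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)


# Permute feature 0 in the hold-out set, breaking its dependence on feature 1.
X_perm = X_te.copy()
X_perm[:, 0] = rng.permutation(X_perm[:, 0])

# 1) Share of hold-out points lying outside the bulk of the training data,
#    before and after permutation.
mean = X_tr.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X_tr, rowvar=False))
threshold = np.quantile(mahalanobis_sq(X_tr, mean, cov_inv), 0.99)
frac_off_orig = float(np.mean(mahalanobis_sq(X_te, mean, cov_inv) > threshold))
frac_off_perm = float(np.mean(mahalanobis_sq(X_perm, mean, cov_inv) > threshold))

# 2) Permute-and-predict importance: reuse the original model on permuted data.
full = fit_ols(X_tr, y_tr)
base = mse(full, X_te, y_te)
pap_importance = mse(full, X_perm, y_te) - base

# 3) Drop-and-refit importance: fit one more model without feature 0.
reduced = fit_ols(X_tr[:, 1:], y_tr)
refit_importance = mse(reduced, X_te[:, 1:], y_te) - base

print(f"off-support share: original {frac_off_orig:.3f}, permuted {frac_off_perm:.3f}")
print(f"importance of feature 0: PaP {pap_importance:.3f}, refit {refit_importance:.3f}")
```

In this toy setup a large share of the permuted points should fall outside the region covered by the training data, and the permute-and-predict importance should come out several times larger than the refit-based one. Because a correctly specified linear model extrapolates gracefully, the gap here reflects only the broken dependence between features; with a flexible black-box model, predictions at those off-support points are additionally unconstrained by data, which is the extrapolation problem the abstract describes.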
