This study aims to define the importance of predictors for black-box machine learning methods, where the prediction function can be highly non-additive and cannot be represented by statistical parameters. We define a ``Generalized Variable Importance Metric (GVIM)'' using the true conditional expectation function for a continuous or a binary response variable. We then show that the GVIM can be represented as a function of the squared Conditional Average Treatment Effect (CATE) for multinomial and continuous predictors. Next, we propose how the metric can be estimated using any machine learning model. Finally, we illustrate the properties of the estimator through multiple simulations.
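As a rough illustration of the idea of a squared-CATE importance measure estimated from a fitted model, the sketch below computes a plug-in estimate for a single binary predictor. It is not the paper's GVIM definition or estimator, which the abstract does not spell out. Everything in it is an assumption made for illustration: the simulated data, the helper names `fit_linear_with_interaction` and `mean_squared_cate`, and the use of a least-squares model with an interaction term as a stand-in for "any machine learning model".

```python
import numpy as np


def fit_linear_with_interaction(x, t, y):
    """Least-squares fit of y ~ 1 + x + t + x*t.

    Hypothetical stand-in for an arbitrary learner; any model exposing
    a predict(x, t) function could be substituted.
    """
    design = np.column_stack([np.ones_like(x), x, t, x * t])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    def predict(x_new, t_new):
        d = np.column_stack([np.ones_like(x_new), x_new, t_new, x_new * t_new])
        return d @ coef

    return predict


def mean_squared_cate(predict, x):
    """Plug-in estimate of E[tau(X)^2], where tau(x) = f(x, 1) - f(x, 0)."""
    tau = predict(x, np.ones_like(x)) - predict(x, np.zeros_like(x))
    return float(np.mean(tau ** 2))


# Simulated data (assumption): the effect of the binary predictor t
# depends on x, so the true CATE is tau(x) = 2 * x.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n).astype(float)
y = 2.0 * x * t + rng.normal(scale=0.5, size=n)

predict = fit_linear_with_interaction(x, t, y)
importance_t = mean_squared_cate(predict, x)
print(f"estimated mean squared CATE for t: {importance_t:.3f}")
```

In this simulation the population value is E[(2X)^2] = 4 for standard normal X, so the printed estimate should land near 4. With a more flexible learner the same two-step pattern applies (fit, then contrast predictions at the two predictor levels), although the statistical properties of such a plug-in estimate depend on the learner and are not addressed by this sketch.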