In this paper we define a population parameter, ``Generalized Variable Importance Metric (GVIM)'', to measure importance of predictors for black box machine learning methods, where the importance is not represented by model-based parameter. GVIM is defined for each input variable, using the true conditional expectation function, and it measures the variable's importance in affecting a continuous or a binary response. We extend previously published results to show that the defined GVIM can be represented as a function of the Conditional Average Treatment Effect (CATE) for any kind of a predictor, which gives it a causal interpretation and further justification as an alternative to classical measures of significance that are only available in simple parametric models. Extensive set of simulations using realistically complex relationships between covariates and outcomes and number of regression techniques of varying degree of complexity show the performance of our proposed estimator of the GVIM.
翻译:暂无翻译