Despite the popularity of feature importance (FI) measures in interpretable machine learning, the statistical adequacy of these methods is rarely discussed. From a statistical perspective, a major distinction is between analyzing a variable's importance before and after adjusting for covariates, i.e., between $\textit{marginal}$ and $\textit{conditional}$ measures. Our work draws attention to this rarely acknowledged yet crucial distinction and showcases its implications. Further, we reveal that only a few methods are available for testing conditional FI, and that practitioners have so far been severely restricted in applying them because of mismatched data requirements. Most real-world data exhibit complex feature dependencies and contain both continuous and categorical features (mixed data). Both properties are often neglected by conditional FI measures. To fill this gap, we propose combining the conditional predictive impact (CPI) framework with sequential knockoff sampling. The CPI enables conditional FI measurement that controls for any feature dependencies by sampling valid knockoffs, that is, synthetic data with similar statistical properties, for the data to be analyzed. Sequential knockoffs were deliberately designed to handle mixed data and thus allow us to extend the CPI approach to such datasets. We demonstrate through numerous simulations and a real-world example that our proposed workflow controls type I error, achieves high power, and is in line with results given by other conditional FI measures, whereas marginal FI metrics result in misleading interpretations. Our findings highlight the necessity of developing statistically adequate, specialized methods for mixed data.
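The marginal versus conditional distinction can be illustrated with a toy linear example. The sketch below is not the CPI and does not use knockoffs. It assumes two correlated Gaussian features, where the outcome depends only on the first, and compares the regression slope of the second feature before and after adjusting for the first. All names (`simulate`, `slope`, `residuals`) and parameter values are hypothetical and chosen only for illustration.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after a least-squares fit on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]

def simulate(n=20000, rho=0.9, seed=0):
    """Assumed toy setup: x2 is correlated with x1, but y depends on x1 only."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for a in x1]
    y = [a + rng.gauss(0, 1) for a in x1]
    return x1, x2, y

x1, x2, y = simulate()

# Marginal view: slope of y on x2, ignoring x1.
marginal = slope(x2, y)

# Conditional view: slope of y on x2 after removing the linear effect of x1
# from both (equivalent to the x2 coefficient in a regression on x1 and x2).
conditional = slope(residuals(x1, x2), residuals(x1, y))

print(f"marginal slope of x2:    {marginal:.3f}")
print(f"conditional slope of x2: {conditional:.3f}")
```

In this setup the marginal slope of the second feature is clearly nonzero because it acts as a proxy for the first, while the adjusted slope is close to zero. This is the kind of misleading interpretation that a marginal measure can produce for a feature with no conditional effect.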