Shapley values are now widely used as a model-agnostic framework for explaining complex predictive machine learning models. They have desirable theoretical properties and a sound mathematical foundation in cooperative game theory. For dependent data, precise Shapley value estimates rely on accurately modeling the dependencies between all feature combinations. In this paper, we use a variational autoencoder with arbitrary conditioning (VAEAC) to model all feature dependencies simultaneously. We demonstrate through comprehensive simulation studies that our VAEAC approach to Shapley value estimation outperforms state-of-the-art methods across a wide range of settings for both continuous and mixed dependent features. In high-dimensional settings, our VAEAC approach with a non-uniform masking scheme significantly outperforms competing methods. Finally, we apply our VAEAC approach to estimate Shapley value explanations for the Abalone data set from the UCI Machine Learning Repository.
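To make the role of feature dependence concrete, the following is a minimal sketch of exact Shapley value computation for a toy case, not the VAEAC method itself. It assumes a hypothetical linear model with two features that follow a bivariate Gaussian with unit variances and correlation `rho`, so the conditional expectation that defines each coalition's value has a closed form. The names `shapley_values` and `make_gaussian_value` and all numeric settings are illustrative assumptions. In the setting of the paper, a learned conditional model would take the place of the closed-form conditional mean.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value):
    """Exact Shapley values for a coalition value function `value(S)`."""
    phi = [0.0] * n_features
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (M - |S| - 1)! / M! for coalitions of this size.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature j to coalition s.
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


def make_gaussian_value(beta, mu, rho, x):
    """Value function v(S) = E[f(x) | x_S] for a linear f and two features.

    Assumption (illustrative only): the features are bivariate Gaussian with
    unit variances and correlation `rho`, so E[x_j | x_k] = mu_j + rho * (x_k - mu_k).
    Because f is linear, E[f(x) | x_S] equals f evaluated at E[x | x_S].
    """
    def value(coalition):
        filled = []
        for j in range(2):
            if j in coalition:
                filled.append(x[j])  # observed feature keeps its value
            elif coalition:
                k = 1 - j  # the other feature is the observed one
                filled.append(mu[j] + rho * (x[k] - mu[k]))
            else:
                filled.append(mu[j])  # nothing observed: marginal mean
        return sum(b * z for b, z in zip(beta, filled))

    return value


# Hypothetical settings chosen only for illustration.
value = make_gaussian_value(beta=(1.0, 2.0), mu=(0.0, 0.0), rho=0.5, x=(1.0, 3.0))
phi = shapley_values(2, value)
print(phi)
```

With `rho` set to zero the result reduces to the independent case, `beta[j] * (x[j] - mu[j])` for each feature. With nonzero `rho` the attributions shift, which is why the conditional distributions need to be modeled accurately when features are dependent.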