Shapley values are ubiquitous in interpretable Machine Learning due to their strong theoretical foundations and efficient implementation in the SHAP library. Computing these values previously incurred a cost exponential in the number of input features of the opaque model. Now, with efficient implementations such as Interventional TreeSHAP, this exponential burden is alleviated, provided one is explaining ensembles of decision trees. Although Interventional TreeSHAP has risen in popularity, it still lacks a formal proof of how and why it works. We provide such a proof with the aim not only of increasing the transparency of the algorithm but also of encouraging further development of these ideas. Notably, our proof for Interventional TreeSHAP is easily adapted to Shapley-Taylor indices and one-hot-encoded features.
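For context, the exponential cost mentioned above comes from the classical definition of the Shapley value, which sums over every subset of the remaining features. A standard formulation (using generic notation, not necessarily that of the paper) for the attribution of feature $i$ with value function $\nu$ over the feature set $N$ with $|N| = d$ is

$$
\phi_i(\nu) \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(d - |S| - 1)!}{d!}\,\bigl[\nu(S \cup \{i\}) - \nu(S)\bigr],
$$

where the sum ranges over the $2^{d-1}$ subsets $S$ not containing $i$, which is the source of the exponential complexity that Interventional TreeSHAP avoids for tree ensembles.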